Squid: Caching Youtube with the Help of Nginx
Youtube caching with Squid + Nginx
Filed under: Linux Related — Tags: nginx cache, youtube cache, youtube cache with nginx — Syed Jahanzaib / Pinochio~:) @ 11:52 AM
Advantages of Youtube Caching !!!
In most parts of the world, bandwidth is very expensive, so in some scenarios it is very useful to cache Youtube videos or other flash videos. If one user downloads a video / flash file, why shouldn't the same user or another user get the same file from the CACHE instead of using up the internet pipe for the same content again and again? People on the same LAN often watch similar videos. If I put a Youtube video link on FACEBOOK, TWITTER or the like, all my friends will watch it, and that particular video gets viewed many times within a few hours. Videos are usually shared over Facebook or other social networking sites, so the chances of multiple hits on popular videos from my LAN users / friends are high.

This is the reason why I wrote this article.

Disadvantages of Youtube Caching !!!

The chance that another user will watch the same video is really slim. If I search for something specific on Youtube, I get hundreds of search results for the same video. What is the chance that another user will search for the same thing and click on the same link / result? Youtube hosts more than 10 million videos, which is too much to cache anyway. You need a lot of space to cache videos, and accordingly you will need fast, modern hardware with tons of SPACE to handle such a giant cache. Anyhow, try it.

AFAIK you are not supposed to cache Youtube videos; Youtube doesn't like it. I don't understand why. Probably because their ranking mechanism relies on views, and possibly completed views, which wouldn't be measurable if the content was served from a local cache.

After struggling unsuccessfully with the storeurl.pl method, I was searching for an alternate method to cache Youtube videos. Finally I found a Ruby-based method using Nginx to cache YT. Using this method I was able to cache all Youtube videos almost perfectly (not 100%, but it works fine in most cases with some modification; I am sure there will be some improvement in the near future).

Updated: 24th August, 2012
Thanks to Mr. Eliezer Croitoru, Mr. Christian Loth & others for their kind guidance.
Following components were used in this guide.
Proxy Server Configuration:
Ubuntu Desktop 10.04
Nginx version: nginx/0.7.65
Squid Cache: Version 2.7.STABLE7

Client Configuration for testing videos:
Windows XP with Internet Explorer 6
Windows 7 with Internet Explorer 8
Let's start with the Proxy Server Configuration:

1) Update Ubuntu

First install Ubuntu. After installation, configure its networking components, then update it with the following commands:

apt-get update
apt-get upgrade

2) Install SSH Server [Optional]

Now install an SSH server so that you can manage your server remotely using PuTTY or any other ssh tool.

apt-get install openssh-server

3) Install Squid Server

Now install Squid Server with the following command:

apt-get install squid

[This will install squid 2.7 by default]

Now edit the squid configuration file by using the following command:
nano /etc/squid/squid.conf
Remove all lines and paste the following data:

# SQUID 2.7/ Nginx TEST CONFIG FILE
# Email: aacable@hotmail.com
# Web : http://aacable.wordpress.com

# PORT and Transparent Option
http_port 8080 transparent
server_http11 on
icp_port 0

# Cache is set to 5GB in this example (zaib)
store_dir_select_algorithm round-robin
cache_dir aufs /cache1 5000 16 256
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LFUDA

# If you want to enable DATE time in SQUID Logs, use following
emulate_httpd_log on
logformat squid %tl %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
log_fqdn off

# How many days to keep users access web logs
# You need to rotate your log files with a cron job. For example:
# 0 0 * * * /usr/local/squid/bin/squid -k rotate
logfile_rotate 14
debug_options ALL,1
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log /var/log/squid/store.log

# [zaib] I used DNSMASQ service for fast dns resolving
# so install by using "apt-get install dnsmasq" first
dns_nameservers 127.0.0.1 221.132.112.8

# ACL Section
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563 # https, snews
acl SSL_ports port 873 # rsync
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl Safe_ports port 631 # cups
acl Safe_ports port 873 # rsync
acl Safe_ports port 901 # SWAT
acl purge method PURGE
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow all
http_reply_access allow all
icp_access allow all

# [zaib] I used UBUNTU so user is proxy, in FEDORA you may use squid
cache_effective_user proxy
cache_effective_group proxy
cache_mgr aacable@hotmail.com
visible_hostname proxy.aacable.net
unique_hostname aacable@hotmail.com

cache_mem 8 MB
minimum_object_size 0 bytes
maximum_object_size 100 MB
maximum_object_size_in_memory 128 KB

refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern (Release|Packages(.gz)*)$ 0 20% 2880
refresh_pattern . 0 50% 4320
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache

# Youtube Cache Section [zaib]
url_rewrite_program /etc/nginx/nginx.rb
url_rewrite_host_header off
acl youtube_videos url_regex -i ^http://[^/]+\.youtube\.com/videoplayback\?
acl range_request req_header Range .
acl begin_param url_regex -i [?&]begin=
acl id_param url_regex -i [?&]id=
acl itag_param url_regex -i [?&]itag=
acl sver3_param url_regex -i [?&]sver=3
cache_peer 127.0.0.1 parent 8081 0 proxy-only no-query connect-timeout=10
cache_peer_access 127.0.0.1 allow youtube_videos id_param itag_param sver3_param !begin_param !range_request
cache_peer_access 127.0.0.1 deny all
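To make the Youtube Cache Section easier to follow, here is a minimal Ruby sketch of the decision that the cache_peer_access line expresses: a request goes to the Nginx peer only if all the "allow" ACLs match and neither of the negated ACLs does. The helper name routed_to_nginx? and the sample host name used with it are hypothetical and only for illustration; Squid evaluates these ACLs itself, so this is not part of the setup.

```ruby
# Hypothetical helper that mirrors the cache_peer_access rule in squid.conf.
# It only looks at the request URL and the value of the Range header (if any).
YOUTUBE_VIDEO = %r{\Ahttp://[^/]+\.youtube\.com/videoplayback\?}i

def routed_to_nginx?(url, range_header = nil)
  return false unless url =~ YOUTUBE_VIDEO                # acl youtube_videos
  return false unless url =~ /[?&]id=/i                   # acl id_param
  return false unless url =~ /[?&]itag=/i                 # acl itag_param
  return false unless url =~ /[?&]sver=3/i                # acl sver3_param
  return false if url =~ /[?&]begin=/i                    # !begin_param
  return false if range_header && !range_header.empty?    # !range_request
  true
end

# Sample URL with a made-up host name and video id.
sample = "http://v1.cache.youtube.com/videoplayback?id=abc123&itag=34&sver=3"
puts routed_to_nginx?(sample)                  # plain request
puts routed_to_nginx?(sample + "&begin=1000")  # seek request, stays in Squid
puts routed_to_nginx?(sample, "bytes=0-1023")  # Range request, stays in Squid
```

Requests that fail any of these checks are handled by Squid as usual, because the last cache_peer_access line denies the peer for everything else.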
Save the file and exit the editor.

4) Install Nginx

Now install Nginx with:

apt-get install nginx

Now edit its config file by using the following command:

nano /etc/nginx/nginx.conf
Remove all lines and paste the following data:

# This config file is not written by me,
# My Email address is inserted Just for tracking purposes
# For more info, visit http://code.google.com/p/youtube-cache/
# Syed Jahanzaib / aacable [at] hotmail.com

user www-data;
worker_processes 4;
pid /var/run/nginx.pid;

events {
    worker_connections 768;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    gzip on;
    gzip_static on;
    gzip_comp_level 6;
    gzip_disable "msie6";
    gzip_vary on;
    gzip_types text/plain text/css text/xml text/javascript application/json application/x-javascript application/xml application/xml+rss;
    gzip_proxied expired no-cache no-store private auth;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;

    # starting youtube section
    server {
        listen 127.0.0.1:8081;

        location / {
            root /usr/local/www/nginx_cache/files;
            # try_files "/id=$arg_id.itag=$arg_itag" @proxy_youtube; # Old one
            # try_files "$uri" "/id=$arg_id.itag=$arg_itag.flv" "/id=$arg_id-range=$arg_range.itag=$arg_itag.flv" @proxy_youtube; #old2
            try_files "/id=$arg_id.itag=$arg_itag.range=$arg_range.algo=$arg_algorithm" @proxy_youtube;
        }

        location @proxy_youtube {
            resolver 221.132.112.8;
            proxy_pass http://$host$request_uri;
            proxy_temp_path "/usr/local/www/nginx_cache/tmp";
            # proxy_store "/usr/local/www/nginx_cache/files/id=$arg_id.itag=$arg_itag"; # Old 1
            proxy_store "/usr/local/www/nginx_cache/files/id=$arg_id.itag=$arg_itag.range=$arg_range.algo=$arg_algorithm";
            proxy_ignore_client_abort off;
            proxy_method GET;
            proxy_set_header X-YouTube-Cache "aacable@hotmail.com";
            proxy_set_header Accept "video/*";
            proxy_set_header User-Agent "YouTube Cacher (nginx)";
            proxy_set_header Accept-Encoding "";
            proxy_set_header Accept-Language "";
            proxy_set_header Accept-Charset "";
            proxy_set_header Cache-Control "";
        }
    }
}
Save the file and exit the editor.
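The try_files and proxy_store lines in the Nginx config both build the cache file name from the id, itag, range and algorithm query arguments of the request. The Ruby sketch below reproduces that naming so you can predict which file a given URL maps to. The function names and the sample URL are hypothetical, and it is a simplification: it matches argument names case-sensitively and does no decoding, so treat it as an illustration rather than an exact copy of Nginx behaviour.

```ruby
require "uri"

# Same directory as "root" / proxy_store in nginx.conf.
CACHE_ROOT = "/usr/local/www/nginx_cache/files"

# Return the raw value of one query argument, or "" when it is missing,
# which is how an unset $arg_... variable expands in the file name.
def query_arg(query, name)
  pair = query.to_s.split("&").find { |kv| kv.start_with?("#{name}=") }
  pair ? pair.split("=", 2).last : ""
end

# Build the file name used by try_files / proxy_store for a video URL.
def cache_path(url)
  q = URI.parse(url).query
  name = "id=#{query_arg(q, 'id')}.itag=#{query_arg(q, 'itag')}" \
         ".range=#{query_arg(q, 'range')}.algo=#{query_arg(q, 'algorithm')}"
  File.join(CACHE_ROOT, name)
end

# Sample URL with a made-up host name and video id.
puts cache_path("http://v1.cache.youtube.com/videoplayback?id=abc123&itag=34&range=13-1781759&algorithm=throttle-factor&sver=3")
```

Because the range is part of the name, each chunk of a video is stored as its own file, and the same video in a different quality (different itag) is cached separately.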
Now create directories to hold the cache files:

mkdir /usr/local/www
mkdir /usr/local/www/nginx_cache
mkdir /usr/local/www/nginx_cache/tmp
mkdir /usr/local/www/nginx_cache/files
chown -Rf www-data /usr/local/www/nginx_cache/files/

Now create the nginx .rb file:

touch /etc/nginx/nginx.rb
chmod 755 /etc/nginx/nginx.rb
nano /etc/nginx/nginx.rb
Paste the following data in this newly created file:

#!/usr/bin/env ruby1.8
# This script is not written by me,
# My Email address is inserted Just for tracking purposes
# For more info, visit http://code.google.com/p/youtube-cache/
# Syed Jahanzaib / aacable [at] hotmail.com
# url_rewrite_program <path>/nginx.rb
# url_rewrite_host_header off

require "syslog"
require "base64"

class SquidRequest
  attr_accessor :url, :user
  attr_reader :client_ip, :method

  def method=(s)
    @method = s.downcase
  end

  def client_ip=(s)
    @client_ip = s.split('/').first
  end
end

def read_requests
  # URL <SP> client_ip "/" fqdn <SP> user <SP> method [<SP> kvpairs]<NL>
  STDIN.each_line do |ln|
    r = SquidRequest.new
    r.url, r.client_ip, r.user, r.method, *dummy = ln.rstrip.split(' ')
    (STDOUT << "#{yield r}\n").flush
  end
end

def log(msg)
  Syslog.log(Syslog::LOG_ERR, "%s", msg)
end

def main
  Syslog.open('nginx.rb', Syslog::LOG_PID)
  log("Started")

  read_requests do |r|
    if r.method == 'get' && r.url !~ /[?&]begin=/ && r.url =~ %r{\Ahttp://[^/]+\.youtube\.com/(videoplayback\?.*)\z}
      log("YouTube Video [#{r.url}].")
      "http://127.0.0.1:8081/#{$1}"
    else
      r.url
    end
  end
end

main
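If you want to see what the rewriter decides for a given request without running Squid, the same rule can be written as a plain function that takes one input line and returns the URL Squid would receive back. This is only a sketch for experimenting: the function name rewrite_line and the sample lines are made up, and syslog is left out.

```ruby
# Same pattern as in nginx.rb: capture everything from "videoplayback?" onwards.
YOUTUBE_URL = %r{\Ahttp://[^/]+\.youtube\.com/(videoplayback\?.*)\z}

# Takes one line in the redirector input format
#   URL <SP> client_ip "/" fqdn <SP> user <SP> method
# and returns the (possibly rewritten) URL.
def rewrite_line(line)
  url, _client, _user, method = line.rstrip.split(" ")
  url = url.to_s
  match = YOUTUBE_URL.match(url)
  if method.to_s.downcase == "get" && url !~ /[?&]begin=/ && match
    "http://127.0.0.1:8081/#{match[1]}"   # hand the request to the Nginx peer
  else
    url                                   # leave everything else untouched
  end
end

# Sample lines with a made-up host name, client address and video id.
puts rewrite_line("http://v1.cache.youtube.com/videoplayback?id=abc123&itag=34 192.168.1.10/- - GET")
puts rewrite_line("http://www.youtube.com/watch?v=abc123 192.168.1.10/- - GET")
```

The first sample is rewritten to the local Nginx listener on port 8081, while the second (a normal page request) comes back unchanged.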
Save the file and exit the editor.

5) Install RUBY

What is RUBY? Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.

Now install RUBY with the following command:

apt-get install ruby

6) Configure Squid Cache DIR and Permissions

Now create the cache dir and assign proper permissions to the proxy user:

mkdir /cache1
chown proxy:proxy /cache1
chmod -R 777 /cache1

Now initialize the squid cache directories with:

squid -z

You should see the following message:

Creating Swap Directories

7) Finally Start/restart SQUID & Nginx

service squid start
service nginx restart
Now from a test PC, open Youtube and play any video. After it has downloaded completely, delete the browser cache and play the same video again. This time it will be served from the cache. You can verify this by monitoring your WAN link utilization while playing the cached file.

The WAN utilization graph below was taken while watching a clip that is not in the cache:

WAN utilization of Proxy, While watching New Clip (Not in cache)

The WAN utilization graph below was taken while watching the same clip once it is in the CACHE:

WAN utilization of Proxy, While watching already cached Clip

Playing Video, loaded from the cache chunk by chunk

It loads the first chunk from the cache; if the user keeps watching the clip, it loads the next chunk at the end of the first chunk, and continues to do so.
Video cache files can be found in the following location:

/usr/local/www/nginx_cache/files

e.g:

ls -lh /usr/local/www/nginx_cache/files

The file shown in the listing is a clip in 360p quality with a length of 5:54 (minutes:seconds). itag=34 indicates that the video quality is 360p.
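Since every cached file name carries the id, itag, range and algo values, a listing of the cache directory can be turned into something more readable with a few lines of Ruby. The sketch below is only an illustration: the function names are hypothetical, the file name layout is the one set by proxy_store in the Nginx config above, and the quality table contains only the one itag value mentioned in this article.

```ruby
# Only the itag mentioned in this article is mapped; others come back as nil.
ITAG_QUALITY = { "34" => "360p" }

# Layout written by proxy_store: id=<id>.itag=<itag>.range=<range>.algo=<algo>
CACHE_NAME = /\Aid=(.*?)\.itag=(\d*)\.range=(.*?)\.algo=(.*)\z/

# Split one cache file name into its parts; returns nil for other files.
def parse_cache_name(path)
  m = CACHE_NAME.match(File.basename(path))
  return nil unless m
  { :id => m[1], :itag => m[2], :quality => ITAG_QUALITY[m[2]],
    :range => m[3], :algo => m[4] }
end

# Describe every cached chunk in a directory, including its size in bytes.
def describe_cache(dir)
  Dir.glob(File.join(dir, "id=*")).sort.map { |path|
    info = parse_cache_name(path)
    info && info.merge(:bytes => File.size(path))
  }.compact
end

# Sample file name with a made-up video id.
p parse_cache_name("id=abc123.itag=34.range=13-1781759.algo=throttle-factor")
```

On the proxy itself, describe_cache("/usr/local/www/nginx_cache/files") would return one entry per cached chunk.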
Credits: Thanks to Mr. Eliezer Croitoru, Mr. Christian Loth & others for their kind guidance.