Squid: Caching YouTube with the Help of Nginx

From OnnoWiki
Youtube caching with Squid + Nginx

Filed under: Linux Related — Tags: nginx cache, youtube cache, youtube cache with nginx — Syed Jahanzaib / Pinochio~:) @ 11:52 AM






Advantages of Youtube Caching !!!

In most parts of the world bandwidth is very expensive, so in some scenarios it is very useful to cache YouTube videos or other flash videos. If one user downloads a video or flash file, why shouldn't the same user, or another user, get that file from the cache instead of pulling the same content through the internet pipe again and again? People on the same LAN often watch similar videos. If I post a YouTube link on Facebook, Twitter or the like, all my friends will watch it, and that particular video gets viewed many times within a few hours. Videos are usually shared over Facebook or other social networking sites, so the chances of multiple hits per popular video are high for my LAN users and friends.

This is the reason why I wrote this article.

Disadvantages of Youtube Caching !!!

The chance that another user will watch the same video is really slim. If I search for something specific on YouTube, I get hundreds of search results for the same video. What is the chance that another user will search for the same thing and click on the same link or result? YouTube hosts more than 10 million videos, which is too much to cache anyway. You need a lot of space to cache videos, and accordingly you will need modern, fast hardware with plenty of disk space to handle such a giant cache. Anyhow, try it.

AFAIK you are not supposed to cache YouTube videos; YouTube doesn't like it. I don't understand why. Probably because their ranking mechanism relies on views, and possibly completed views, which wouldn't be measurable if the content was served from a local cache.

After struggling unsuccessfully with the storeurl.pl method, I searched for an alternate method to cache YouTube videos. Finally I found a Ruby-based method that uses Nginx to cache YouTube. Using this method I was able to cache all YouTube videos almost perfectly (not 100%, but it works fine in most cases with some modification; I am sure there will be some improvement in the near future).

Updated: 24th August, 2012

Thanks to Mr. Eliezer Croitoru, Mr. Christian Loth and others for their kind guidance.

The following components were used in this guide.

Proxy Server Configuration:
Ubuntu Desktop 10.04
Nginx version: nginx/0.7.65
Squid Cache: Version 2.7.STABLE7

Client Configuration for testing videos:
Windows XP with Internet Explorer 6
Windows 7 with Internet Explorer 8

Let's start with the Proxy Server Configuration:

1) Update Ubuntu

First install Ubuntu. After installation, configure its networking components, then update its package lists with the following command:

    apt-get update

2) Install SSH Server [Optional]

Now install an SSH server so that you can manage your server remotely using PuTTY or any other SSH tool.

    apt-get install openssh-server

3) Install Squid Server

Now install the Squid server with the following command:

    apt-get install squid

[This will install Squid 2.7 by default]

Now edit the Squid configuration file using the following command:

    nano /etc/squid/squid.conf

Remove all lines and paste the following data:

    # SQUID 2.7/ Nginx TEST CONFIG FILE
    # Email: aacable@hotmail.com
    # Web : http://aacable.wordpress.com
    # PORT and Transparent Option
    http_port 8080 transparent
    server_http11 on
    icp_port 0

    # Cache is set to 5GB in this example (zaib)
    store_dir_select_algorithm round-robin
    cache_dir aufs /cache1 5000 16 256
    cache_replacement_policy heap LFUDA
    memory_replacement_policy heap LFUDA

    # If you want to enable date and time in Squid logs, use the following
    emulate_httpd_log on
    logformat squid %tl %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
    log_fqdn off

    # How many days to keep users' web access logs
    # You need to rotate your log files with a cron job. For example:
    # 0 0 * * * /usr/local/squid/bin/squid -k rotate
    logfile_rotate 14
    debug_options ALL,1
    cache_access_log /var/log/squid/access.log
    cache_log /var/log/squid/cache.log
    cache_store_log /var/log/squid/store.log

    # [zaib] I used the DNSMASQ service for fast DNS resolving,
    # so install it using "apt-get install dnsmasq" first
    dns_nameservers 127.0.0.1 221.132.112.8

    # ACL Section
    acl all src 0.0.0.0/0.0.0.0
    acl manager proto cache_object
    acl localhost src 127.0.0.1/255.255.255.255
    acl to_localhost dst 127.0.0.0/8
    acl SSL_ports port 443 563 # https, snews
    acl SSL_ports port 873 # rsync
    acl Safe_ports port 80 # http
    acl Safe_ports port 21 # ftp
    acl Safe_ports port 443 563 # https, snews
    acl Safe_ports port 70 # gopher
    acl Safe_ports port 210 # wais
    acl Safe_ports port 1025-65535 # unregistered ports
    acl Safe_ports port 280 # http-mgmt
    acl Safe_ports port 488 # gss-http
    acl Safe_ports port 591 # filemaker
    acl Safe_ports port 777 # multiling http
    acl Safe_ports port 631 # cups
    acl Safe_ports port 873 # rsync
    acl Safe_ports port 901 # SWAT
    acl purge method PURGE
    acl CONNECT method CONNECT
    http_access allow manager localhost
    http_access deny manager
    http_access allow purge localhost
    http_access deny purge
    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports
    http_access allow localhost
    http_access allow all
    http_reply_access allow all
    icp_access allow all

    # [zaib] I used UBUNTU so the user is proxy; in FEDORA you may use squid
    cache_effective_user proxy
    cache_effective_group proxy
    cache_mgr aacable@hotmail.com
    visible_hostname proxy.aacable.net
    unique_hostname aacable@hotmail.com

    cache_mem 8 MB
    minimum_object_size 0 bytes
    maximum_object_size 100 MB
    maximum_object_size_in_memory 128 KB

    refresh_pattern ^ftp: 1440 20% 10080
    refresh_pattern ^gopher: 1440 0% 1440
    refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
    refresh_pattern (Release|Packages(.gz)*)$ 0 20% 2880
    refresh_pattern . 0 50% 4320
    acl apache rep_header Server ^Apache
    broken_vary_encoding allow apache

    # Youtube Cache Section [zaib]
    url_rewrite_program /etc/nginx/nginx.rb
    url_rewrite_host_header off
    acl youtube_videos url_regex -i ^http://[^/]+\.youtube\.com/videoplayback\?
    acl range_request req_header Range .
    acl begin_param url_regex -i [?&]begin=
    acl id_param url_regex -i [?&]id=
    acl itag_param url_regex -i [?&]itag=
    acl sver3_param url_regex -i [?&]sver=3
    cache_peer 127.0.0.1 parent 8081 0 proxy-only no-query connect-timeout=10
    cache_peer_access 127.0.0.1 allow youtube_videos id_param itag_param sver3_param !begin_param !range_request
    cache_peer_access 127.0.0.1 deny all
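The cache_peer_access rule at the end of the Youtube Cache Section can be read as one boolean test: the URL must be a videoplayback request with id, itag and sver=3 parameters, without a begin parameter and without a Range header. The Ruby sketch below mirrors those ACLs for illustration only. The method name goes_to_nginx_peer? and the sample URLs are made up for this example; Squid itself evaluates the ACLs, not this code.

```ruby
# Illustrative sketch only: mirrors the cache_peer_access ACLs from squid.conf.
# The method name and the sample URLs are hypothetical.

YOUTUBE_VIDEOS = %r{^http://[^/]+\.youtube\.com/videoplayback\?}i
BEGIN_PARAM    = /[?&]begin=/i
ID_PARAM       = /[?&]id=/i
ITAG_PARAM     = /[?&]itag=/i
SVER3_PARAM    = /[?&]sver=3/i

# headers is a Hash of request header names to values.
def goes_to_nginx_peer?(url, headers = {})
  # "acl range_request req_header Range ." matches any non-empty Range header.
  range_request = headers.any? do |name, value|
    name.to_s.casecmp("Range") == 0 && !value.to_s.empty?
  end

  !!(url =~ YOUTUBE_VIDEOS) &&
    !!(url =~ ID_PARAM) &&
    !!(url =~ ITAG_PARAM) &&
    !!(url =~ SVER3_PARAM) &&
    url !~ BEGIN_PARAM &&
    !range_request
end

sample = "http://v1.lscache1.c.youtube.com/videoplayback?sver=3&id=abc123&itag=34&range=0-1781759"
puts goes_to_nginx_peer?(sample)                             # plain chunk request
puts goes_to_nginx_peer?(sample + "&begin=5000")             # seek request
puts goes_to_nginx_peer?(sample, "Range" => "bytes=0-1023")  # HTTP Range header present
```

Only the first sample request would be sent to the Nginx peer; the other two fall through to "deny all" and are handled by Squid directly.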

Save & Exit.

4) Install Nginx

Now install Nginx with:

    apt-get install nginx

Now edit its config file using the following command:

    nano /etc/nginx/nginx.conf

Remove all lines and paste the following data:

    # This config file is not written by me,
    # My Email address is inserted Just for tracking purposes
    # For more info, visit http://code.google.com/p/youtube-cache/
    # Syed Jahanzaib / aacable [at] hotmail.com

    user www-data;
    worker_processes 4;
    pid /var/run/nginx.pid;

    events {
        worker_connections 768;
    }

    http {
        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        keepalive_timeout 65;
        types_hash_max_size 2048;
        include /etc/nginx/mime.types;
        default_type application/octet-stream;
        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;
        gzip on;
        gzip_static on;
        gzip_comp_level 6;
        gzip_disable "msie6";
        gzip_vary on;
        gzip_types text/plain text/css text/xml text/javascript application/json application/x-javascript application/xml application/xml+rss;
        gzip_proxied expired no-cache no-store private auth;
        gzip_buffers 16 8k;
        gzip_http_version 1.1;
        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;

        # starting youtube section
        server {
            listen 127.0.0.1:8081;

            location / {
                root /usr/local/www/nginx_cache/files;
                # try_files "/id=$arg_id.itag=$arg_itag" @proxy_youtube; # Old one
                # try_files "$uri" "/id=$arg_id.itag=$arg_itag.flv" "/id=$arg_id-range=$arg_range.itag=$arg_itag.flv" @proxy_youtube; #old2
                try_files "/id=$arg_id.itag=$arg_itag.range=$arg_range.algo=$arg_algorithm" @proxy_youtube;
            }

            location @proxy_youtube {
                resolver 221.132.112.8;
                proxy_pass http://$host$request_uri;
                proxy_temp_path "/usr/local/www/nginx_cache/tmp";
                # proxy_store "/usr/local/www/nginx_cache/files/id=$arg_id.itag=$arg_itag"; # Old 1
                proxy_store "/usr/local/www/nginx_cache/files/id=$arg_id.itag=$arg_itag.range=$arg_range.algo=$arg_algorithm";
                proxy_ignore_client_abort off;
                proxy_method GET;
                proxy_set_header X-YouTube-Cache "aacable@hotmail.com";
                proxy_set_header Accept "video/*";
                proxy_set_header User-Agent "YouTube Cacher (nginx)";
                proxy_set_header Accept-Encoding "";
                proxy_set_header Accept-Language "";
                proxy_set_header Accept-Charset "";
                proxy_set_header Cache-Control "";
            }
        }
    }
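The try_files and proxy_store directives in this config build the on-disk file name from four query arguments (id, itag, range and algorithm). As a rough illustration, the Ruby sketch below reproduces that naming. The helper names arg and cache_path and the sample request URIs are assumptions made for this example, and it assumes that a missing argument expands to an empty string, as a missing $arg_ variable does in Nginx.

```ruby
# Illustrative sketch only: reproduces the file name that the try_files and
# proxy_store directives build from the query string.
# CACHE_ROOT matches the nginx config; arg and cache_path are hypothetical helpers.

CACHE_ROOT = "/usr/local/www/nginx_cache/files"

# Return the raw (undecoded) value of a query argument, or "" when absent.
def arg(query, name)
  query.to_s.split("&").each do |pair|
    key, value = pair.split("=", 2)
    return value.to_s if key == name
  end
  ""
end

# Map a request URI to the path where the cached chunk would be stored.
def cache_path(request_uri)
  query = request_uri.split("?", 2)[1]
  name = format("id=%s.itag=%s.range=%s.algo=%s",
                arg(query, "id"), arg(query, "itag"),
                arg(query, "range"), arg(query, "algorithm"))
  File.join(CACHE_ROOT, name)
end

puts cache_path("/videoplayback?sver=3&id=abc123&itag=34&range=0-1781759&algorithm=throttle-factor")
puts cache_path("/videoplayback?id=abc123&itag=34")
```

Because range is part of the name, each requested chunk of a video ends up as its own file, and two requests share a cached file only when all four arguments match.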

Save & Exit.

Now Create directories to hold cache files

    mkdir /usr/local/www
    mkdir /usr/local/www/nginx_cache
    mkdir /usr/local/www/nginx_cache/tmp
    mkdir /usr/local/www/nginx_cache/files
    chown www-data /usr/local/www/nginx_cache/files/ -Rf

Now create the nginx .rb file:

    touch /etc/nginx/nginx.rb
    chmod 755 /etc/nginx/nginx.rb
    nano /etc/nginx/nginx.rb

Paste the following data into this newly created file:

    #!/usr/bin/env ruby1.8
    # This script is not written by me,
    # My Email address is inserted Just for tracking purposes
    # For more info, visit http://code.google.com/p/youtube-cache/
    # Syed Jahanzaib / aacable [at] hotmail.com
    # url_rewrite_program <path>/nginx.rb
    # url_rewrite_host_header off

    require "syslog"
    require "base64"

    class SquidRequest
      attr_accessor :url, :user
      attr_reader :client_ip, :method

      def method=(s)
        @method = s.downcase
      end

      def client_ip=(s)
        @client_ip = s.split('/').first
      end
    end

    def read_requests
      # URL <SP> client_ip "/" fqdn <SP> user <SP> method [<SP> kvpairs]<NL>
      STDIN.each_line do |ln|
        r = SquidRequest.new
        r.url, r.client_ip, r.user, r.method, *dummy = ln.rstrip.split(' ')
        (STDOUT << "#{yield r}\n").flush
      end
    end

    def log(msg)
      Syslog.log(Syslog::LOG_ERR, "%s", msg)
    end

    def main
      Syslog.open('nginx.rb', Syslog::LOG_PID)
      log("Started")

      read_requests do |r|
        if r.method == 'get' && r.url !~ /[?&]begin=/ && r.url =~ %r{\Ahttp://[^/]+\.youtube\.com/(videoplayback\?.*)\z}
          log("YouTube Video [#{r.url}].")
          "http://127.0.0.1:8081/#{$1}"
        else
          r.url
        end
      end
    end

    main
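To see what the rewriter does to a single request without running Squid, the rule from the script can be written as a pure function. This is only a sketch based on the script above: the name rewrite_line and the sample helper lines are invented for illustration, and the syslog calls are left out.

```ruby
# Illustrative sketch only: the rewrite rule from nginx.rb as a pure function,
# so it can be tried on sample helper lines without running Squid.
# rewrite_line and the sample inputs are hypothetical.

NGINX_BACKEND = "http://127.0.0.1:8081"
VIDEO_URL     = %r{\Ahttp://[^/]+\.youtube\.com/(videoplayback\?.*)\z}

# line has the form: URL client_ip/fqdn user method [kvpairs]
def rewrite_line(line)
  url, _client, _user, method = line.rstrip.split(" ")
  match = VIDEO_URL.match(url.to_s)
  if method.to_s.downcase == "get" && url !~ /[?&]begin=/ && match
    # Send the videoplayback request to the local Nginx cache.
    "#{NGINX_BACKEND}/#{match[1]}"
  else
    # Everything else is passed back to Squid unchanged.
    url
  end
end

puts rewrite_line("http://v1.cache.youtube.com/videoplayback?id=abc123&itag=34 192.168.0.10/- - GET")
puts rewrite_line("http://www.example.com/index.html 192.168.0.10/- - GET")
```

The first line is rewritten to the Nginx listener on 127.0.0.1:8081, while the second is returned as-is, which is the behaviour Squid expects from a url_rewrite_program helper.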

Save & Exit.

5) Install RUBY

What is RUBY? Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.

Now install Ruby with the following command:

    apt-get install ruby

6) Configure Squid Cache DIR and Permissions

Now create the cache directory and assign proper permissions to the proxy user:

    mkdir /cache1
    chown proxy:proxy /cache1
    chmod -R 777 /cache1

Now initialize the Squid cache directories with:

    squid -z

You should see the following message:

    Creating Swap Directories

7) Finally Start/restart SQUID & Nginx

    service squid start
    service nginx restart

Now, from a test PC, open YouTube and play any video. After it has downloaded completely, delete the browser cache and play the same video again. This time it will be served from the cache. You can verify this by monitoring your WAN link utilization while playing the cached file.

Look at the WAN utilization graph below; it was taken while watching a clip that was not in the cache.

[Figure: WAN utilization of proxy while watching a new clip (not in cache)]

Now look at the WAN utilization graph below; it was taken while watching the same clip once it was in the cache.

[Figure: WAN utilization of proxy while watching an already cached clip]

[Figure: Playing video, loaded from the cache chunk by chunk]

It will load the first chunk from the cache; if the user keeps watching the clip, it will load the next chunk at the end of the first chunk, and will continue to do so.
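This chunk-by-chunk behaviour follows from the range argument being part of the cache file name, so each chunk is stored and looked up separately. The following in-memory Ruby sketch illustrates the idea under simplifying assumptions: the class ChunkCache, the video id and the range values are invented for this example, the algo part of the name is left out, and the real lookups are done by Nginx's try_files and proxy_store rather than by this code.

```ruby
require "set"

# Illustrative sketch only: simulates per-chunk hit/miss behaviour in memory.
class ChunkCache
  def initialize
    @stored = Set.new
  end

  # Returns :hit when the chunk is already stored; otherwise stores it
  # (as proxy_store would after fetching it upstream) and returns :miss.
  def fetch(id, itag, range)
    key = "id=#{id}.itag=#{itag}.range=#{range}"
    return :hit if @stored.include?(key)

    @stored.add(key)
    :miss
  end
end

cache = ChunkCache.new
# First viewer stops after two chunks.
first_view  = ["0-999", "1000-1999"].map { |r| cache.fetch("abc123", "34", r) }
# Second viewer watches one chunk further.
second_view = ["0-999", "1000-1999", "2000-2999"].map { |r| cache.fetch("abc123", "34", r) }
p first_view
p second_view
```

In this simulation the second viewer gets the first two chunks locally and only the third chunk has to be fetched over the WAN link.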

Video cache files can be found in the following location:

    /usr/local/www/nginx_cache/files

e.g.:

    ls -lh /usr/local/www/nginx_cache/files

The file in the above listing is a clip in 360p quality with a length of 5:54 (minutes:seconds). itag=34 indicates that the video quality is 360p.
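File names in the cache directory can also be taken apart programmatically. The Ruby sketch below is a minimal example that assumes the naming scheme from the nginx config (id, itag, range, algo); the helper name describe_cache_file and the sample file name are made up, and the only itag mapping included is 34 for 360p, as stated in this article. Other itag values are reported as unknown.

```ruby
# Illustrative sketch only: splits a cache file name produced by the nginx
# config into its parts. Only itag 34 => 360p comes from this article.

ITAG_QUALITY = { "34" => "360p" }.freeze
NAME_PATTERN = /\Aid=(?<id>.*?)\.itag=(?<itag>.*?)\.range=(?<range>.*?)\.algo=(?<algo>.*)\z/

# Returns a Hash describing the file, or nil if the name does not match.
def describe_cache_file(path)
  m = NAME_PATTERN.match(File.basename(path))
  return nil unless m

  {
    id:      m[:id],
    itag:    m[:itag],
    range:   m[:range],
    quality: ITAG_QUALITY.fetch(m[:itag], "unknown")
  }
end

# Hypothetical file name, for illustration.
p describe_cache_file("/usr/local/www/nginx_cache/files/id=abc123.itag=34.range=0-1781759.algo=throttle-factor")
```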

Credits: Thanks to Mr. Eliezer Croitoru, Mr. Christian Loth and others for their kind guidance.

