Blog Archives
#Linux : extract substring in #Bash
One of the most important tasks in dealing with data is string manipulation. We already saw how to use awk and grep to efficiently sift through text files using command line tools instead of developing ad-hoc code. To step it up a notch, we can also do some heavier preprocessing of the data, such as selecting only the subset of information that matches a particular pattern, to ensure data coming out of our pipeline is of good quality.
In this case, we use a Bash feature called parameter expansion. Let's assume we have the text data in a variable TEXT_LINE and a pattern to match (in file-name matching format, i.e., glob patterns). Here is a summary of the possible expansions:
- Delete shortest match of pattern from the beginning
${TEXT_LINE#pattern}
- Delete longest match of pattern from the beginning
${TEXT_LINE##pattern}
- Delete shortest match of pattern from the end
${TEXT_LINE%pattern}
- Delete longest match of pattern from the end
${TEXT_LINE%%pattern}
- Get a substring by position, where OFFSET is the starting position and LENGTH the number of characters to take
${TEXT_LINE:OFFSET:LENGTH}
- Replace particular strings or patterns
${TEXT_LINE/pattern/replace}
So for example, to extract only the file name without the extension:
${TEXT_LINE%.*}
or to extract user name from an email:
${TEXT_LINE%%@*.*}
or extract the file name from an absolute path:
${TEXT_LINE##*/}
NOTE: you can't combine two operations in a single expansion; instead, assign the intermediate result to a variable and apply the second expansion to it.
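To see these expansions in action, here is a minimal sketch (the path and email values are made-up examples):
TEXT_LINE="/var/log/app/report.tar.gz"
echo "${TEXT_LINE##*/}"        # report.tar.gz  (file name from absolute path)
FILE_NAME="${TEXT_LINE##*/}"
echo "${FILE_NAME%%.*}"        # report  (the second operation needs the intermediate variable)
EMAIL="jane.doe@example.com"
echo "${EMAIL%%@*.*}"          # jane.doe  (user name from an email)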
#Linux : use grep to extract lines around a pattern match
In the text processing toolkit of every sysadmin or data scientist out there, there is a well-known command line utility whose usefulness is second to none (ok, maybe sed is a strong contender): grep.
So “grepping” has become a common term among developers, with much the same meaning as “googling”: find something that matches my query in this file. grep returns the list of lines matching the specified pattern.
However, sometimes it is useful to have some context around the pattern match, especially if we are talking about system logs; in this case grep has a couple of little-known flags that are super useful: -A, for lines after the pattern match, and -B, for lines before it.
For example, to display 10 lines before and 15 lines after any system log line that contains the word error:
#journalctl --no-pager | grep "error" -A 15 -B 10
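If you want the same amount of context on both sides of the match, the -C flag sets -A and -B at once:
journalctl --no-pager | grep -C 10 "error"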
#MacOsX : Terminal Cheat Sheet
If you are a *nix geek like me, you can't help but love the command prompt.
One of the best tools to improve the plain old terminal is a utility called tmux. You can install it through Homebrew.
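Assuming Homebrew is already set up, installation is a one-liner:
brew install tmux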
Now, there are many commands to remember to play nicely with the terminal, and sometimes a little reminder might be useful; that's why cheat sheets exist.
Here is mine, enjoy.
#cURL : HOWTO [UPDATED]
You can use the cURL library and the curl command to design your own request and explore the response. There are many possible uses, e.g., API debugging, web hacking, pen testing.
curl is a tool to transfer data from or to a server, using one of the supported protocols (e.g., FTP, GOPHER, HTTP, HTTPS, IMAP, LDAP, POP3, RTMP, SCP, SFTP, SMTP, TELNET). The command is designed to work without user interaction.
curl offers a busload of useful tricks like proxy support, user authentication, FTP upload, HTTP post, SSL connections, cookies, file transfer resume, Metalink, and more. As you will see below, the number of features will make your head spin!
So curl is a truly powerful command; however, this power comes at the cost of complexity. Here I will show some real-world use cases.
URL
The URL syntax is protocol-dependent. If you specify a URL without the protocol:// prefix, curl will attempt to guess what protocol you might want. It will default to HTTP but try other protocols based on often-used host name prefixes. For example, for host names starting with “ftp.” curl will assume you want to speak FTP.
You can specify multiple URLs or parts of URLs by writing part sets within braces as in:
curl en.wikipedia.org/wiki/{FTP,SCP,TELNET}
or you can get sequences of alphanumeric series by using [ ] as in:
curl forums.macrumors.com/showthread.php?t=[1673700-1673713]
curl numericals.com/file[1-100].txt
curl numericals.com/file[001-100].txt
curl letters.com/file[a-z].txt
Nested sequences are not supported, but you can use several ones next to each other:
curl any.org/archive[1996-1999]/vol[1-4]/part{a,b,c}.html
You can specify any number of URLs on the command line. They will be fetched sequentially, in the specified order.
You can specify a step counter for the ranges to get every Nth number or letter:
curl numericals.com/file[1-100:10].txt
curl letters.com/file[a-z:2].txt
Trace Dump
In order to analyze in depth what we send and receive, we can save everything to a file; this is as easy as:
curl --trace-ascii DebugDump.txt URL
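If you only need to inspect the request and response headers rather than a full trace, the -v (--verbose) flag prints them directly to the terminal:
curl -v http://echo.httpkit.com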
Save To Disk
If you want to save the response to disk you can use the option -o <file>. If you are using {} or [] to fetch multiple documents, you can use ‘#’ followed by a number in the file specifier: that variable will be replaced with the current string for the URL being fetched. Remember to protect the URL from the shell by adding quotes if you receive the error message internal error: invalid pattern type (0). Examples:
curl 'en.wikipedia.org/{FTP,TFTP,SFTP}' -o "#1.html"
curl arxiv.org/pdf/13[01-11].36[00-75].pdf -o "arXiv13#1.36#2.pdf"
The option -O writes output to a local file named like the remote file we get (only the file part of the remote file is used, the path is cut off). The remote file name to use for saving is extracted from the given URL, nothing else. Consequently, the file will be saved in the current working directory. If you want the file saved in a different directory, make sure you change the current working directory before you invoke curl:
curl -O arxiv.org/pdf/1301.3600.pdf
Only the file part of the remote file is used, the path is cut off, thus the file will be saved as 1301.3600.pdf.
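For example, to drop the file in /tmp (just an example destination) without changing your current shell directory, run curl in a subshell:
(cd /tmp && curl -O arxiv.org/pdf/1301.3600.pdf)   # saved as /tmp/1301.3600.pdf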
Set HTTP Request Method
The curl default HTTP method, GET, can be set to any method you would like using the -X <command> option. The usual suspects POST, PUT, DELETE, and even custom methods, can be specified:
curl -X POST echo.httpkit.com
Normally you don’t need this option. All sorts of GET, HEAD, POST and PUT requests are rather invoked by using dedicated command line options.
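For example, a DELETE request, and a POST done the dedicated way (-d implies POST, so no -X is needed):
curl -X DELETE http://echo.httpkit.com
curl -d "key=value" http://echo.httpkit.com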
Forms
Forms are the general way a web site can present an HTML page with fields for the user to enter data in, plus some kind of ‘submit’ button to get that data sent to the server. The server then typically uses the posted data to decide how to act: for example, using the entered words to search in a database, adding the info to a bug tracking system, displaying the entered address on a map, or using the info as a login prompt to verify that the user is allowed to see what it is about to see.
Using the -d option we can specify URL-encoded field names and values:
curl -d "prefisso=051" -d "numero=806060" -d "Prosegui=Verifica" -d "form_name=verifica_copertura_ehiveco" http://www.ovus.it/verifica_copertura_ehiveco.php
A very common way for HTML-based applications to pass state information between pages is to add hidden fields to the forms. Hidden fields are already filled in, they aren't displayed to the user, and they get passed along just like all the other fields. To curl there is no difference at all; you just add them on the command line.
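For instance, if the page source reveals a hidden field (both field names below are made up), just post it along with the visible ones:
curl -d "search=cryptdb" -d "hidden_token=abc123" http://example.com/search.php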
Set Request Headers
Request headers allow clients to provide servers with meta information about things such as authorization, capabilities, and body content-type. OAuth2 uses an Authorization header to pass access tokens, for example. Custom headers are set in curl using the -H option:
curl -H "Authorization: OAuth 2c4419d1aabeec" http://echo.httpkit.com
curl -H "Accept: application/json" -H "Authorization: OAuth 2c3455d1aeffc" http://echo.httpkit.com
Note that if you add a custom header that has the same name as one of the internal ones curl would use, your externally set header will be used instead of the internal one. You should not replace internally set headers without knowing perfectly well what you're doing. Remove an internal header by giving a replacement without content on the right side of the colon, as in: -H "Host:".
If you send a custom header with no value, its header must be terminated with a semicolon, such as -H "X-Custom-Header;" to send "X-Custom-Header:".
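For example, to drop curl's default Accept header and send the made-up X-Custom-Header with an empty value:
curl -H "Accept:" -H "X-Custom-Header;" http://echo.httpkit.com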
curl will make sure that each header you add/replace is sent with the proper end-of-line marker; you should thus not add that as part of the header content: do not add newlines or carriage returns, they will only mess things up for you.
Referer
An HTTP request may include a referer field (yes, it is misspelled), which can be used to tell from which URL the client got to this particular resource. Some programs/scripts check the referer field of requests to verify that the request wasn't arriving from an external site or an unknown page. While this is a stupid way to check something so easily forged, many scripts still do it.
This can also be set with the -H, --header flag of course. When used with -L, --location you can append ";auto" to the --referer URL to make curl automatically set the previous URL when it follows a Location: header. The ";auto" string can be used alone, even if you don't set an initial --referer.
curl -e google.com http://echo.httpkit.com
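or, combined with -L so that curl updates the referer automatically while following redirects:
curl -L -e "google.com;auto" http://echo.httpkit.com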
User Agent
To specify the User-Agent string to send to the HTTP server, you can use the --user-agent flag. To encode blanks in the string, surround the string with single quote marks. This can also be set with the -H, --header option of course. Many applications use this information to decide how to display pages. At times, you will see that getting a page with curl will not return the same page that you see when getting the page with your browser. Then you know it is time to set the User-Agent field to fool the server into thinking you're one of those browsers:
curl -A "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_2 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5" http://echo.httpkit.com
Cookies
The way the web browsers do “client side state control” is by using cookies. Cookies are just names with associated contents. The cookies are sent to the client by the server. The server tells the client for what path and host name it wants the cookie sent back, and it also sends an expiration date and a few more properties.
When a client communicates with a server with a name and path as previously specified in a received cookie, the client sends back the cookies and their contents to the server, unless of course they are expired.
Many applications and servers use this method to connect a series of requests into a single logical session. To be able to use curl in such occasions, we must be able to record and send back cookies the way the web application expects them, the same way browsers deal with them.
The argument to the -b, --cookie option is supposedly the data previously received from the server in a "Set-Cookie:" line. The data should be in the format "NAME1=VALUE1; NAME2=VALUE2".
If no = symbol is used in the argument, it is treated as a filename from which to read previously stored cookie lines, which should be used in this session if they match. Using this method also activates the "cookie parser", which will make curl record incoming cookies too; this may be handy if you're using it in combination with the -L, --location option. The file to read cookies from should be in plain HTTP header format or the Netscape/Mozilla cookie file format. NOTE that the file specified with -b, --cookie is only used as input; no cookies will be stored in the file. To store cookies, use the -c, --cookie-jar option, or you could even save the HTTP headers to a file using -D, --dump-header:
curl --cookie "name=whitehatty" http://echo.httpkit.com
curl -c cookies.txt http://www.facebook.com
sed -i '' 's/#HttpOnly_\.facebook\.com/echo\.httpkit\.com/g' cookies.txt
curl --cookie cookies.txt http://echo.httpkit.com
curl -b cookies.txt --cookie-jar newcookies.txt http://echo.httpkit.com
curl --dump-header headers_and_cookies http://www.facebook.com
Work In Progress…
Ok, there are many more options, but I will stop here for now. I will add more in the future, so if you have any requests (like using more real URLs) just leave a comment.
#MacOsX : Show Hidden Files and Folders
In *NIX systems, files and folders whose names begin with a dot (e.g., .name) are hidden. Since Mac OS X is a certified UNIX, that's also the case there, and the Finder (the file browser) does not show them. From the terminal you can list them with the command:
ls -a
However, most people use the regular Finder. To show hidden files in the Finder, use this command:
defaults write com.apple.finder AppleShowAllFiles -bool TRUE
and then restart the finder with the following command:
killall Finder
To revert the changes use the same command, but replace TRUE with FALSE.
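That is:
defaults write com.apple.finder AppleShowAllFiles -bool FALSE
killall Finder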
#MacOsX : vimrc
If you are looking to configure Vim, you can find the default configuration file in:
/usr/share/vim/vimrc
Copy and rename it into your home directory:
cp /usr/share/vim/vimrc ~/.vimrc
However, it is minimal, so it is better to personalize it a bit. One very simple example is the following:
" Configuration file for vim
set modelines=0 " CVE-2007-2438
" Normally we use vim-extensions. If you want true vi-compatibility
" remove change the following statements
" Use Vim defaults instead of 100% vi compatibility
set nocompatible
" more powerful backspacing
set backspace=2
" Display line numbers on the left
set number
" Allow intelligent auto-indenting for each filetype
" and for "plugins that are filetype specific.
filetype indent plugin on
" Fallback when no filetype-specific indenting is enabled
set autoindent
" Enable syntax highlighting
syntax on
" Display the cursor position
set ruler
" Don't write backup file if vim is being called by "crontab -e"
au BufWrite /private/tmp/crontab.* set nowritebackup
" Don't write backup file if vim is being called by "chpass"
au BufWrite /private/etc/pw.* set nowritebackup
#MacOsX : Disable Auto-Save and Versions in Mac OS X
Auto-Save and Versions are excellent features in Mac OS X, but some advanced users are annoyed by them as they often don’t want to save intermediate versions of their work.
Moreover, some apps write lots of data to disk (e.g., iMovie and iBooks Author) and this can shorten the life of an SSD (look here for more SSD tuning).
If you know the name of the app plist you want to disable auto-save and Versions for, you can just plug the name into the defaults write command:
defaults write app-plist ApplePersistence -bool no
If you don’t know it then you can find it with the following command:
osascript -e 'id of application "NAME OF APP"'
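For example, for TextEdit this returns com.apple.TextEdit, so the two commands become:
osascript -e 'id of application "TextEdit"'
defaults write com.apple.TextEdit ApplePersistence -bool no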
Now if you enter the Versions window, the auto-save list will be empty and there are no versions to restore to. You'll probably want to turn off File Locking too.
NOTE: some sandboxed apps require another command in addition:
defaults write app-plist AutosavingDelay -int 0
This is especially true for TextEdit, as it is the only Apple app that uses "old-style" autosaving, and this causes issues with the sandbox in Lion/Mountain Lion.
NOTE2: it seems that the preference can be set globally, but this may make the login process very slow and possibly cause other unexpected behaviour:
defaults write -g ApplePersistence -bool no
#CryptDB : HOWTO Compile on Ubuntu Linux [UPDATE 2]
First, what is CryptDB.
A SHORT PRESENTATION, very useful to understand how it works.
Second, reference system: Ubuntu Linux LTS 12.04.x 32bit 64bit (see this comment).
Third, [NEW] installation:
sudo apt-get update
sudo apt-get install git ruby
git clone -b public git://g.csail.mit.edu/cryptdb
cd cryptdb
sudo ./scripts/install.rb .
Done. It’s that simple now 😎
If it fails to compile, see THIS comment.
If you still do not succeed, see THIS comment.
With recent version of Ubuntu (14.04 and 16.04) you might need to downgrade Bison, see THIS comment.
[OLD] installation:
- install needed packages:
sudo apt-get install automake bison bzr cmake flex g++ git gtk-doc-tools libaio-dev libbsd-dev libevent-dev libglib2.0-dev libgmp-dev liblua5.1-0-dev libmysqlclient-dev libncurses5-dev libntl-dev libssl-dev
- create a directory, then download software to compile:
mkdir $HOME/cryptdb-inst
cd $HOME/cryptdb-inst
git clone -b public git://g.csail.mit.edu/cryptdb
wget http://es.csail.mit.edu/mysql-5.5.14.tar.gz
bzr branch lp:mysql-proxy
- compile mysql-proxy:
cd mysql-proxy
sh ./autogen.sh
./configure --enable-maintainer-mode --with-lua=lua5.1
make
sudo make install
- build CryptDB on MySQL:
cd $HOME/cryptdb-inst
tar xzf mysql-5.5.14.tar.gz
cp -R cryptdb/parser/mysql_mods/* mysql-5.5.14/
rm mysql-5.5.14/sql/sql_yacc.{cc,h}
cd mysql-5.5.14
mkdir build
cd build
cmake -DWITH_EMBEDDED_SERVER=ON ..
make
sudo make install
cd /usr/local/mysql
sudo chown -R mysql .
sudo cp support-files/my-medium.cnf /etc/my.cnf
sudo scripts/mysql_install_db --user=mysql --basedir=/usr/local/mysql/
sudo /usr/local/mysql/bin/mysqld_safe --lc-messages-dir="/usr/local/mysql/share/english/"
/usr/local/mysql/bin/mysqladmin -u root password 'letmein'
- Build CryptDB:
cd $HOME/cryptdb-inst/cryptdb
cp conf/config.mk.sample conf/config.mk
sed -i'' -e"1s%/home/nickolai/build%$HOME/cryptdb-inst%" conf/config.mk
make
sudo make install
- now, it's time to read cryptdb/doc/README, enjoy! 😉
NOTE1: you should create a mysql user to run the DBMS, for security reasons:
sudo groupadd mysql
sudo useradd -r -g mysql mysql
NOTE2: be very careful on each step and you won't fail.
#MacOSX : IP Scanner Pro, Network Scanning for Dummies
Are you accustomed to incomprehensible command line tools?
Finally I have the right solution: IP Scanner Pro
It's all about friendliness! You can ping, wake up, whitelist, etc., all the devices found with just one click.
I will show you just an image; you don't need anything else! 😉
NOTE: I have hidden MAC address.