Prevent duplicates from being saved in bash history


Bash Problem Overview


I'm trying to prevent bash from saving duplicate commands to my history. Here's what I've got:

shopt -s histappend
export HISTIGNORE='&:ls:cd ~:cd ..:[bf]g:exit:h:history'
export HISTCONTROL=erasedups
export PROMPT_COMMAND='history -a'

This works fine while I'm logged in and .bash_history is in memory. For example:

$ history
    1 vi .bashrc
    2 vi .alias
    3 cd /cygdrive
    4 cd ~jplemme
    5 vi .bashrc
    6 vi .alias

$ vi .bashrc

$ history
    1 vi .alias
    2 cd /cygdrive
    3 cd ~jplemme
    4 vi .alias
    5 vi .bashrc

$ vi .alias

$ history
    1 cd /cygdrive
    2 cd ~jplemme
    3 vi .bashrc
    4 vi .alias

$ exit

But when I log back in, my history file looks like this:

$ history
    1 vi .bashrc
    2 vi .alias
    3 cd /cygdrive
    4 cd ~jplemme
    5 vi .bashrc
    6 vi .alias
    7 vi .bashrc
    8 vi .alias

What am I doing wrong?

EDIT: Removing the shopt and PROMPT_COMMAND lines from .bashrc does not fix the problem.

Bash Solutions


Solution 1 - Bash

As far as I know, it is not possible to do what you want. I see this as a bug in bash's history processing that could be improved.

export HISTCONTROL=ignoreboth:erasedups   # no duplicate entries
shopt -s histappend                       # append history file
export PROMPT_COMMAND="history -a"        # update histfile after every command

This will keep the in-memory history unique, but while it does save history from multiple sessions into the same file, it doesn't keep the history in the file itself unique. history -a will write the new command to the file unless it's the same as the one immediately before it. It will not do a full de-duplication like the erasedups setting does in memory.

To see this silliness in action, start a new terminal session, examine the history, and you'll see repeated entries, say ls. Now run the ls command, and all the duplicated ls will be removed from the history in memory, leaving only the last one. The in memory history becomes shorter as you run commands that are duplicated in the history file, yet the history file itself continues to grow.
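A quick way to check how many duplicates have piled up in the file is to list every command that occurs more than once. This is only a sketch: the function name history_dups is made up for illustration, and it assumes timestamps are not enabled, so each line of the file is one command.

```shell
# Hypothetical helper: print each command that appears more than once
# in a history file. Uses the file given as $1, else $HISTFILE,
# else ~/.bash_history.
history_dups() {
   local file=${1:-${HISTFILE:-$HOME/.bash_history}}
   # uniq only compares adjacent lines, so sort first
   sort "$file" | uniq -d
}
```

Running history_dups with no arguments in a fresh terminal should list the commands that erasedups would collapse in memory but that are still repeated on disk.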

I use my own script to clean up the history file on demand.

# remove duplicates while preserving input order
function dedup {
   awk '! x[$0]++' "$@"
}

# removes $HISTIGNORE commands from input
function remove_histignore {
   if [ -n "$HISTIGNORE" ]; then
      # replace : with |, then * with .*
      local IGNORE_PAT=$(echo "$HISTIGNORE" | sed 's/:/|/g; s/\*/.*/g')
      # negated grep removes matches; -E so that | means alternation
      grep -Evx "$IGNORE_PAT" "$@"
   else
      cat "$@"
   fi
}

# clean up the history file by removing duplicates and commands matching
# $HISTIGNORE entries
function history_cleanup {
   local HISTFILE_SRC=~/.bash_history
   local HISTFILE_DST=/tmp/.$USER.bash_history.clean
   if [ -f "$HISTFILE_SRC" ]; then
      # keep a backup before rewriting the file
      \cp "$HISTFILE_SRC" "$HISTFILE_SRC.backup"
      dedup "$HISTFILE_SRC" | remove_histignore >| "$HISTFILE_DST"
      \mv "$HISTFILE_DST" "$HISTFILE_SRC"
      chmod go-r "$HISTFILE_SRC"
      # reload the cleaned file into the in-memory history
      history -c
      history -r
   fi
}

I'd love to hear more elegant ways to do this.

Note: the script won't work if you enable timestamps in history via HISTTIMEFORMAT.
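With HISTTIMEFORMAT set, bash writes a line of the form #<epoch seconds> before each command in the history file, so a line-by-line dedup separates timestamps from their commands. A timestamp-aware variant could look roughly like the sketch below. The name dedup_timestamped is hypothetical, and the sketch assumes every command occupies a single line in the file; multi-line commands would need more work.

```shell
# Hypothetical helper: like dedup, but keeps each "#<epoch>" timestamp line
# together with the command that follows it.
# Assumes every command is a single line in the history file.
dedup_timestamped() {
   awk '
      /^#[0-9]+$/ { ts = $0; next }   # remember the timestamp, print nothing yet
      !seen[$0]++ {                   # first time this command is seen
         if (ts != "") print ts       # emit its timestamp, if it has one
         print                        # then the command itself
      }
      { ts = "" }                     # a timestamp belongs to one command only
   ' "$@"
}
```

Like dedup, this keeps the first occurrence of each command, so the surviving timestamp is the time the command was first run.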

Bash could improve the situation by:

  1. Fixing history -a to write a new command only if it does not match any entry in the in-memory history, not just the last one.
  2. Deduplicating history when files are read if the erasedups setting is set. A simple history -w in a new terminal would then clean up the history file instead of the silly script above.

Solution 2 - Bash

The problem is definitely the histappend. Tested and confirmed on my system.

My relevant environment is:

$ set | grep HIST
HISTFILE=/Users/hop/.bash_history
HISTFILESIZE=500
HISTIGNORE=' *:&:?:??'
HISTSIZE=500
$ export HISTCONTROL=erasedups
$ shopt | grep hist
cmdhist         on
histappend      off
histreedit      off
histverify      off
lithist         off

Now that I think about it, the problem is probably with the history -a. history -w should write the current history without any duplicates, so use that if you don't mind the concurrency issues.
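Taken literally, that suggestion might look like the following .bashrc fragment. This is a sketch rather than a tested recipe. history -w overwrites the history file with the in-memory list, so with several terminals open, whichever session writes last wins. Duplicates that were already in the file also stay in memory until the command is run again, because erasedups only removes older copies of the line being added.

```shell
# Sketch of a .bashrc fragment, assuming a single terminal session at a time
export HISTCONTROL=erasedups     # drop older copies of each newly entered command
shopt -u histappend              # overwrite the history file on exit instead of appending
PROMPT_COMMAND='history -w'      # rewrite the whole file after every command
```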

Solution 3 - Bash

export HISTCONTROL=ignoreboth   # ignorespace + ignoredups; ignoredups skips only consecutive repeats

Solution 4 - Bash

Here is what I use:

[vanuganti@ ~]$ grep HIST .alias*
.alias:HISTCONTROL="erasedups"
.alias:HISTSIZE=20000
.alias:HISTIGNORE=ls:ll:"ls -altr":"ls -alt":la:l:pwd:exit:mc:su:df:clear:ps:h:history:"ls -al"
.alias:export HISTCONTROL HISTSIZE HISTIGNORE
[vanuganti@ ~]$ 

and it works:

[vanuganti@ ~]$ pwd
/Users/XXX
[vanuganti@ ~]$ pwd
/Users/XXX
[vanuganti@ ~]$ history | grep pwd | wc -l
       1

Solution 5 - Bash

Inside your .bash_profile, add:

alias hist="history -a && hist.py"

Then put this on your path as hist.py and make it executable:

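#!/usr/bin/env python

from __future__ import print_function
import os, sys

home = os.getenv("HOME")
if not home:
    sys.exit(1)

# read the history file, closing it when done
with open(os.path.join(home, ".bash_history")) as f:
    lines = f.readlines()

# walk backwards so the most recent copy of each command is the one kept
seen = set()
history = []
for s in lines[::-1]:
    s = s.rstrip()
    if s not in seen:
        seen.add(s)
        history.append(s)

# restore chronological order
print('\n'.join(history[::-1]))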

Now when you want the short list, just type hist.
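The same keep-the-last-occurrence idea can be sketched in shell alone. The function name hist_unique is made up for illustration, and the sketch assumes tac from GNU coreutils is installed. Unlike the Python script, it does not strip trailing whitespace, so two lines that differ only in trailing spaces count as different commands.

```shell
# Hypothetical helper: print a history file with duplicates removed,
# keeping the most recent occurrence of each command.
# Assumes tac (GNU coreutils) is available.
hist_unique() {
   local file=${1:-${HISTFILE:-$HOME/.bash_history}}
   # reverse, keep the first copy of each line, then reverse back
   tac "$file" | awk '!seen[$0]++' | tac
}
```

As with the alias above, running history -a first makes sure the current session's commands are in the file before it is read.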

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author   Original Content on Stackoverflow
Question            JPLemme           View Question on Stackoverflow
Solution 1 - Bash   raychi            View Answer on Stackoverflow
Solution 2 - Bash   user3850          View Answer on Stackoverflow
Solution 3 - Bash   Robert Gamble     View Answer on Stackoverflow
Solution 4 - Bash   Venu Anuganti     View Answer on Stackoverflow
Solution 5 - Bash   jserver           View Answer on Stackoverflow