Curate your shell history

(esham.io)

118 points | by todsacerdoti 21 hours ago

20 comments

  • 1vuio0pswjnm7 12 hours ago
    "If I type a shell command that's valuable - one that did something useful enough that I might want it again in future, and long and complicated enough that I'd be annoyed to have to figure it out a second time from scratch - then I can't rely on it just happening to be in my .bash_history. So instead I put it somewhere else: maybe a shell function in my .bashrc, or maybe a shell script in my directory of random useful scriptlets."

    I have a folder marked executable with hundreds of small shell scripts, appended to $PATH. I never write long scripts. I sometimes write scripts that run other scripts but I try to avoid such dependencies. I do not use bash, I use NetBSD Almquist or modified Debian Almquist shell so all the scripts are highly portable. I can move a tarball of this folder from computer to computer, whether the OS is Linux or BSD. These scripts generally run faster than bash scripts.

    It would be untenable to save all command line history since I spend every day in the shell in textmode (not x11) and only occasionally switch to a graphical environment. I sometimes use Almquist-based shells that have command line history removed. This forces me to write useful scripts.

    A lot of cumulative shell knowledge is contained in that folder of scripts. These scripts have worked for decades unchanged. I think the Almquist shell may be my favourite interpreter. No matter what else I try I always come back to the shell language. There is something appealing about scripts that seem to work forever.

  • mmh0000 18 hours ago
    I read this article in horror. Even after he said he kept /only/ the last 9800 commands.

      ⟩ history | wc -l
      170919 
    
    I have (deduplicated) shell history with timestamps going back years. With a little bit of CTRL+R and `history | grep …`, I never have to remember how I did some arcane magic in the past, just that I did.
    • pama 16 hours ago
      Maybe I'm a hoarder. My timestamped histories from four machines in four different locations added up to 898028 lines --- I kept most of my shell commands, including errors and iterative attempts to get a good command, since early 2009 in the servers that matter the most to me. I use named shells with different histories each (roughly one to four shells per project), so the largest single shell history in these logs is only 43424 lines (21712 commands). The average across all shells is a bit over 75 logged commands per day, which doesn't sound as bad as the total line count in these histories. Despite the advent of LLMs I still need a lot of internal esoteric hacks that are relatively easy to find in these histories. Perhaps I should switch to using a centralized db that keeps the outputs as well and use them to finetune LLMs, but the ease of simple unix/Emacs tools operating with standard/simple history files always felt attractive.
    • koolba 15 hours ago
      Same here. Also included the host, timestamp, user, and current directory. Unifying history across multiple machines is very pleasant when you know you did something, but don’t even remember where you did it.
      • remram 12 hours ago
        What tool do you use for that?
        • earnestinger 11 hours ago
          I bet it was “bash” (with sed and rsync)
          • koolba 10 hours ago
            You forgot jq. The shell history is stored as json lines. Makes for easy grepping too.
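A minimal sketch of what a JSON-lines history record with host, user, directory and timestamp could look like. This is not koolba's actual tool: the field names and the helpers `json_escape` and `history_json_line` are hypothetical, and the escaping only covers backslashes and double quotes.

```shell
#!/usr/bin/env bash
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
# Control characters (newlines, tabs) are NOT handled in this sketch.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Emit one JSON object per command (field names are assumptions).
history_json_line() {
    local host=$1 user=$2 cwd=$3 ts=$4 cmd=$5
    printf '{"host":"%s","user":"%s","cwd":"%s","ts":%s,"cmd":"%s"}\n' \
        "$(json_escape "$host")" "$(json_escape "$user")" \
        "$(json_escape "$cwd")" "$ts" "$(json_escape "$cmd")"
}
```

Because each record is one line, a plain `grep '"host":"box"'` over the file narrows results to a single machine, and jq can be used for anything more structured.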
    • abathur 18 hours ago
      How long does it seem to take to open a new terminal and reach a working shell?

      Your hardware might be fast enough that it's negligible, but several years ago (maybe 2018-19ish?) I noticed my shell startup was getting sluggish and traced a large fraction of it back to the time bash took to load many lines into history.

      I still like keeping it all, so I wrote something to dump it into a database as I go but then truncate working histories to something shorter (500, I think).
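A rough sketch of the "keep everything, but keep the working file short" idea. It swaps the database for a plain-text archive file to stay self-contained, so it is a simplification rather than abathur's setup; the function name `archive_and_truncate` is hypothetical, and it assumes a history file without timestamp lines (one command per line).

```shell
#!/usr/bin/env bash
# Move all but the last $3 lines (default 500) of a history file into an
# archive file, leaving only the most recent lines in the working file.
archive_and_truncate() {
    local histfile=$1 archive=$2 keep=${3:-500} total drop tmp
    [ -f "$histfile" ] || return 0
    total=$(wc -l < "$histfile")
    drop=$(( total - keep ))
    # Nothing to do if the file is already short enough.
    (( drop > 0 )) || return 0
    # Only the lines about to be dropped are archived, so repeated runs
    # do not duplicate entries in the archive.
    head -n "$drop" "$histfile" >> "$archive" || return 1
    tmp=$(mktemp) || return 1
    tail -n "$keep" "$histfile" > "$tmp" && mv "$tmp" "$histfile"
}
```

Running this from a shell exit hook or a periodic job would keep startup loading bounded while the archive grows without limit.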

      • loeg 18 hours ago
        Instantaneous. bash might just be slow here; I'm using zsh.

          $ time zsh -i -c 'exit'
          zsh -i -c 'exit'  0.03s user 0.02s system 100% cpu 0.055 total
        • hdjrudni 9 hours ago

              time zsh -i -c 'exit'
              zsh -i -c 'exit'  0.33s user 0.31s system 97% cpu 0.658 total
          
          Yeah... I've definitely got something going on here.
        • abathur 17 hours ago
          Are you sure it's loading history in there? My .zsh_history isn't completely empty, and when I run the same command with 'exit' replaced by 'history' it doesn't print anything. (But this might have something to do with macOS default shell profile stuff.)
          • bee_rider 16 hours ago
            Actually, this seems like an interesting question. I don’t really see any reason why a shell must load the history file before the user starts typing. I mean, I usually don’t ^r immediately, so to speed things up it could:

            * Load asynchronously

            * Only have a barrier if I reverse-search.

            Might be over-engineered, though.

            • loeg 15 hours ago
              zsh may very well be doing this, especially for any de-duplication index.
          • loeg 17 hours ago
            Not confident, no. But even running an interactive zsh manually and exiting as fast as humanly possible is within the same ballpark, modulo human reaction times.

              $ time zsh
              $ ^D
              zsh  0.14s user 0.04s system 80% cpu 0.219 total
            
            It's just not an obstacle to spawning a new shell and using it.
            • AdieuToLogic 16 hours ago
              A way to ensure the zsh invocation behaves as a typical interactive shell is:

                time zsh --login -c 'logout'
              
              Note use of logout instead of exit. In this context, logout ensures whatever combination of flags used results in a login shell.

              See zshbuiltins(1).

              • loeg 15 hours ago
                On par with the first time.

                  $ time zsh --login -c 'logout'
                  zsh --login -c 'logout'  0.03s user 0.01s system 99% cpu 0.037 total
                
                Whenever I've had noticeably slow zsh startup times in the past, it was almost always some plugin/extension doing something very dumb (e.g. stuff like full 'git status' in a large repo -- just takes time); not the history management.
                • AdieuToLogic 10 hours ago
                  > On par with the first time.

                  This was my experience as well. The --login flag was recommended in order to address concerns raised by @abathur.

                  > Whenever I've had noticeably slow zsh startup times in the past, it was almost always some plugin/extension doing something very dumb (e.g. stuff like full 'git status' in a large repo -- just takes time); not the history management.

                  Great point.

                  Unfortunately, my limited research suggests tracking down which plugin or extension is the root cause is a manual effort starting with the contents of the canonical zsh initialization files (often named .zlogin, .zprofile, .zshenv, and .zshrc).

                • tough 14 hours ago
                  nvm plugin used to be really bad!

                  I think you can debug-trace loading time by putting some commands in your zsh config

            • bee_rider 16 hours ago
              Is .14s noticeable? That’s more than a couple frames at 60fps.

              I mean, this might seem really bizarre to worry about, but some folks use tiling window managers and just expect to immediately start typing once the “new terminal” key is hit…

              Actually this has been slightly annoying me lately after making my system pretty. I added some fancy compositor stuff and now if I do “open terminal,” “exit” the computer complains that it doesn’t know what xit means. Not a big enough problem to fix though.

              Edit: huh, actually playing with it this seems to be 99% the fault of the terminal emulator anyway, not the shell or my silly special effects.

              • AdieuToLogic 10 hours ago
                > Is .14s noticeable?

                Unless the system is under memory pressure, most shell initialization will read from in-memory OS file caches and not be noticeable as you note.

                Where significant delays are often seen is when a seemingly innocuous extension uses network-based or some other heavy file system I/O commands (such as a "find $HOME -type f" type of thing).

              • umbra07 12 hours ago
                This is why I switched to `fish`. Customizing zsh to achieve feature parity with fish (along with the tide prompt) made zsh veryyyy slow. It's entirely possible that there's some sort of optimized loader for zsh out there that ameliorates this, but I just couldn't be bothered.
              • loeg 15 hours ago
                The 0.14s includes me noticing zsh has started and typing ^D.
      • o11c 15 hours ago
        The only time I've ever had shell startup time be significant is when loading fancy shell rc stuff (oh-my-zsh is the most infamous but there are many others)

        Unfortunately I've hit some obscure bug with exactly when `HISTFILESIZE` is applied, and had some shells truncate all my history undesirably. There are also problems with reading history if the timestamp format changes.

      • mzs 17 hours ago
        Just put them in files, e.g. one file per month, per machine/mount, per shell (bash, csh, etc.). Then the shell will only load the current month's history. You can grep the files and easily narrow down what to search.
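One way the month-per-machine-per-shell layout might be wired up in bash. The directory layout and the helper name `monthly_histfile` are assumptions made for illustration, not something specified in the comment.

```shell
#!/usr/bin/env bash
# Build a history file path of the form <dir>/<host>/<YYYY-MM>.<shell>
monthly_histfile() {
    local dir=$1 host=$2 month=$3 shell=$4
    printf '%s/%s/%s.%s\n' "$dir" "$host" "$month" "$shell"
}

# Possible use in ~/.bashrc (the ~/.history location is an assumption):
# HISTFILE=$(monthly_histfile "$HOME/.history" "$HOSTNAME" "$(date +%Y-%m)" bash)
# mkdir -p "$(dirname "$HISTFILE")"
```

With this layout, `grep -r pattern ~/.history` searches everything, while `grep pattern ~/.history/*/2024-05.*` would restrict the search to one month across machines.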
    • loeg 18 hours ago
      Yeah. On one host:

        wc -l .zhistory
        121104 .zhistory
  • belden 17 hours ago
    I fall into the hoard-and-curate camp.

    I use bash within tmux heavily, and got irked that a command I run in one shell session is not immediately available as a history item in other concurrent shell sessions. So I wrote a history plugin based on bash-preexec to track everything to two files: a per-directory history file, and a global history file.

    I have a bash function which does history selection for me, by popping an fzf selector to look at the directory-specific history file. A keybinding within fzf allows me to switch to looking at the global history file instead.

    Boring commands such as “cd”, “pushd”, and a few others don’t get logged. The log entries are in json format and include basic metadata; directory, timestamp, and pid.

    Within the fzf history picker, another keybinding allows me to edit whichever file I’m actively using. So if I fumble a few times to construct a command, then when I get it right I just pop into the selector, edit the relevant file, and remove the lines I don’t want to misfire on again.

    I’m sure this is basically what atuin does; now that I’m at the spot where directories are the unit of relevant history, maybe I should give that tool another look.

    One really interesting upside of all of this is that I now tend to make “activity-specific” directories in my repos. For example, I have a “.deploy” directory at the git root of most of my projects. There are no files within that directory; but my tool creates ~/.bash.d/history/home/belden/github/company/project/.deploy.json which contains the history of ~/github/company/project/.deploy/

    Empty directories are invisible to git, but for me the directory “has” content: the log of how I need to deploy this service or that service.

    It’s a weird way to use my shell, and just sprung out of the initial grief: I shouldn’t have to exit a shell session to have its history become available.
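A small sketch of the two pieces described above: mapping a directory to its history file and skipping boring commands. The path layout follows the `.deploy.json` example in the comment; the function names are hypothetical, and `popd` is an assumed member of the "few others" list.

```shell
#!/usr/bin/env bash
# Map an absolute directory to its history file: <root><absolute dir>.json
dir_history_file() {
    local root=$1 dir=${2%/}   # strip one trailing slash, if any
    printf '%s%s.json\n' "$root" "$dir"
}

# Return success if the command's first word is on the "boring" list.
# cd and pushd come from the comment; popd is an assumption.
is_boring() {
    local first=${1%% *}
    case $first in
        cd|pushd|popd) return 0 ;;
        *) return 1 ;;
    esac
}
```

A preexec-style hook could call `is_boring "$cmd"` first and, if it fails, append a record to both `dir_history_file "$root" "$PWD"` and a global file.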

    • rane 15 hours ago
      > command I run in one shell session is not immediately available as a history item in other concurrent shell sessions

      It's hard to understand how any other way is usable. If you have a dozen concurrent shell sessions, some running a long-running process for example, and you had to restart one of them, you wouldn't be able to just Ctrl-C and go to the previous command. For me it's way better that shell sessions live their own lives while the history accumulates into a single pool.

  • epistasis 17 hours ago
    I definitely want to keep every single shell command I ever executed, including mistakes. And I want to track across machines, projects, etc.

    Lately I have been using atuin for this. I have some gripes about interface defaults, and how the interactive search for history wipes out most of what was on my terminal view and causes a big "wtf was I doing again?" moment when it changes the terminal so much, but the tradeoff of shared history is very very much worth it for me.

    • wonger_ 14 hours ago
      Why keep the mistakes? Just curious.
      • tough 14 hours ago
        So you don't make them again?

        AI agents using tools also benefit from seeing past usage of the tools (what worked, what didn't) to help inform future usage.

        • mattrighetti 13 hours ago
          It’s difficult to distinguish whether the command failed because of a typo or because the program it launched crashed.

          I was thinking about this the other day and I was specifically trying to think of a way to avoid the command typos ending up in the history file. I don’t think it’s useful to keep those around.

      • epistasis 12 hours ago
        Mistakes often have side effects, and sometimes I don't notice it until later. It's good to come back to a project after a week and try to see where unintentional effects may have come from.
  • hiAndrewQuinn 17 hours ago
    >My unusual habit is: turn off the history file completely, by putting the command ‘unset HISTFILE’ in my .bashrc. [...] All the shell history I allow myself is localised and short-term.

    I don't do this, but I do disable my browser history on my work laptop so that I'm forced to author things I, and presumably other people, can actually stumble their way to later on.

    https://hiandrewquinn.github.io/til-site/posts/disable-your-...

  • c0l0 19 hours ago
    I've been doing this for years, and my shell history is usually only a few lines/commands long when I start a new session. I like it that way. My browser's roughly the same - I always start firefox in "private browsing mode" (my system's default browser application is a shell wrapper around firefox that takes care of that), and I very consciously use a non-private instance to "soft-bookmark" stuff that gets committed to my browser history. Actual bookmarks are for all the stuff that I consider really important not to forget about. A few sites (such as HN) have the privilege of me visiting them in non-private browsing mode while authenticated via cookies.

    People might find that weird, and it sure is a tad inconvenient when random, low-priority websites cosplay Internet Fort Knox with very short and annoying MFA login timeouts and methods, but for me, the benefits (I'm less susceptible to tracking and can click on stuff with less angst that I might somehow leak data from my nigh-eternal browser session to some site or another, and I get to start relatively fresh in each session) outweigh the costs.

    • 127dot1 18 hours ago
      Nowadays EVERY site pretends being Fort Knox. Everyone's got a captcha for you.
  • jsphweid 19 hours ago
    Where I work someone wrote a system to allow one to save every command they ever typed and make it available for searching via cli or web app. I opted in. It's probably one of the most useful tools I've ever used.
    • kriro 6 hours ago
      How many credentials get exposed this way? That would be my main concern.
    • ethersteeds 7 hours ago
      Centrally? Can you browse others'?
  • Cockbrand 19 hours ago
    I've been following a similar approach for a while. In bash, I have `HISTCONTROL=ignoreboth` activated (or `setopt hist_ignore_space hist_ignore_dups` in zsh - most shells have an equivalent setting), and I just prepend most commands with a space. So only the more or less important stuff ends up in .bash_history or .zsh_history. This has the added nice effect that I don't accidentally trigger something destructive when I browse through my history.

    I also spend most of my online life in incognito windows, at least for sites that don't absolutely require a login. This keeps my browser history clean from all the disposable pages that I only visit once, and I take care to do only the more meaningful stuff in a regular browser window.

    • alecco 2 hours ago

        HISTCONTROL=ignoreboth
        HISTIGNORE='rm *:ls *:cd *:cp *:builtin *'
      
      The history file has ALL the lines prefixed by space. I curate it a lot, with most frequent/recent commands near the end. And I add a lot of comments to the file so things are easily searched. And search for things, for example /rsync to find the rsync quick backup snapshot lines so first comes one with -n to check what it would do, then if it looks OK I press 'n' (vi mode) and get the same line without -n, or another 'n' and I get the same without -n and with --delete.

          (cd ~ && [ -d ~/backup_mnt/ ] && rsync    -xav --delete --exclude='backup_mnt' --exclude='.debug' --exclude='.cache' "$HOME" ~/backup_mnt/backup )   # rsync DELETE
          (cd ~ && [ -d ~/backup_mnt/ ] && rsync    -xav          --exclude='backup_mnt' --exclude='.debug' --exclude='.cache' "$HOME" ~/backup_mnt/backup )   # rsync NO DELETE
          (cd ~ && [ -d ~/backup_mnt/ ] && rsync -n -xav          --exclude='backup_mnt' --exclude='.debug' --exclude='.cache' "$HOME" ~/backup_mnt/backup )   # rsync LIST
      
      For very long lines I always explain it in a comment.

      When trying to modify a command, I remove the space, and when done I search for saving history and editing with '/_hi':

          history -a ; vi ~/.bash_history && history -r
      
      and voila.

      I also keep the similar lines aligned with spacing to note the differences quickly. (e.g. the -n above)

      When exiting a shell and there's nothing interesting, history -r; exit

      I also always go to the end of the file before quitting vi, so next time I'm where it last ended and before the newly appended commands.

      It takes work upfront and discipline but the everyday use of shell becomes very fast and clean.

    • Rediscover 19 hours ago
      Two prehensile thumbs up - your HISTCONTROL or

      HISTIGNORE='m:??:info STARHERE:info:[bf]g:exit:[bf]g %[0-9]:help STARHERE:date:cal:cal ????:exec env ENV\STARHERE' (replace STARHERE with an asterisk)

      generally works for me in bash(1).

      • remram 12 hours ago

          HISTIGNORE='m:??:info *:info:[bf]g:exit:[bf]g %[0-9]:help *:date:cal:cal ????:exec env ENV\*'
      • belden 17 hours ago
        Oh crazy, I didn’t know about this aspect of bash.

        The line above sets up automatic history ignoring for the colon-delimited shell globs; e.g.

        fg %3

        won’t be recorded to shell history, since it’s matched by one of the globs.
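To see why `fg %3` is ignored, the matching rule can be imitated in a few lines of bash. This is only an approximation of what bash does internally: the function name `histignore_matches` is hypothetical, and the special `&` pattern (which refers to the previous history line) is not modelled.

```shell
#!/usr/bin/env bash
# Return success if $2 matches any colon-separated glob in $1.
# As with HISTIGNORE, a pattern must match the complete command line.
histignore_matches() {
    local patterns=$1 cmd=$2 pat
    local -a pats
    # Split on ':' without triggering pathname expansion of the globs.
    IFS=: read -r -a pats <<< "$patterns"
    for pat in "${pats[@]}"; do
        # Unquoted right-hand side: $pat is treated as a glob pattern.
        [[ $cmd == $pat ]] && return 0
    done
    return 1
}
```

With the pattern list `m:??:[bf]g %[0-9]:cal ????`, the command `fg %3` matches `[bf]g %[0-9]` and `ls` matches `??`, while `fg %12` matches nothing because no pattern covers the whole line.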

  • wswope 18 hours ago
    Obligatory plug for Atuin, which is a Sqlite-based shell history tool. It logs shell commands alongside timestamps, the working directory, and the return value.

    You can optionally sync your history to a server in encrypted form to keep a shared history across hosts. The server is extremely easy to self host.

    As discussed in TFA, it’s easy to filter or scrub junk like `cd ~/Desktop` so it doesn’t pollute your history. You can also fuzzy search and toggle between commands run in your current session, commands run on your current host, commands run in your CWD, and commands run on all hosts.

    It’s my single favorite piece of dev tooling, and has made my job + life far smoother and easier. Highly recommend.

    https://atuin.sh/

    • mediumsmart 3 hours ago
      Atuin sounds great so I asked my llm if I should install it and it said: You already have a solid Bash+JSON+fzf setup that works and is flexible.

      considering to ask where my json bash history is located and how that works - I noticed Ctrl-r looked different a while ago.

    • nullwarp 18 hours ago
      This is the absolute first thing I ever install now. The amount of time it saves me is immeasurable.

      There's no way I'm going to manually curate every command I type, I've got actual work to do.

      • Too 5 hours ago
        What are the most useful features?
    • calmbonsai 15 hours ago
      Ooooh, I'll have to take it for a test drive next week. This looks good for personal use as well as keeping a fleet of production maintenance/monitoring instances sync'd.
    • acedTrex 17 hours ago
      An absolute requirement for me. I also support it as a sponsor; it's such a useful project.
  • csmattryder 19 hours ago
    I use fish with the sponge plugin that omits any command that doesn't return exit code 0. Combined with ones like `z` to jump around directories, I reckon my history is pretty tidy.

    Anything important and convoluted, I'll shove it in the fish config for quick access.

  • omoikane 17 hours ago
    > unset HISTFILE

    I don't do this because I didn't know "unset HISTFILE" was an option. Instead I symlink .bash_history (and other history things like .python_history, .lesshst, etc.) to /dev/null

  • Xx_crazy420_xX 9 hours ago
    The preference of shell history depends on the use case. In my case it's ever-changing projects with different tech stacks, where I need to remember the set of commands to get SSO, reach the container registry, etc.

    The solution for me was storing bash history separately for each vscode project - https://gist.github.com/Srakai/9f3788cdb07259d65d335ff150c38...

    Scripts are cool, but imagine an ever-evolving script that is controlled by the arrow keys and Enter

  • cobertos 19 hours ago
    I like keeping _all_ my shell history, with timestamps. Even the fuckups. It's come in handy a few times now

    * Looking back at my old process for onboarding into some new technology. I can easily see how my technique evolved as I optimized for more correctness (e.g. cp -r vs cp -rp vs rsync with whatever flags does everything I want, I know I want the most recent incantation)

    * Figure out the exact process I did for a backup or other movement of data when I want to ensure I have copied X, Y, and Z. Sometimes my shell history has saved me where my notes have failed.

    • jhgorrell 18 hours ago
      I have most of my shell history back to 2005(?). Each terminal gets its own new history file.

      99% of the time, I never look at it, but when I do need to look at it, it has been great. My boss once asked me: "What args and screening file did we use when we made that one-off DB 4 months ago?" Was able to check and confirm it was correct. Or for personal use: "Where did I move that folder of pictures?"

      • PeterWhittaker 17 hours ago
        I opted for a single history across all sessions on any given host: On my main machine, the first of 54,434 entries is timestamped 2020-08-22:12:39, while on the machine on which I do most development at the moment (it varies from product to product and release to release), the first of 34,771 entries is timestamped 2023-05-08:11:34.

        For the curious, the salient .bashrc bits are:

          function _setAndReloadHistory {
              builtin history -a
              builtin history -c
              builtin history -r
          }
          
          # preserve shell history
          set -o history
          # preserve multiline commands...
          shopt -s cmdhist
          # preserve multiline command as literally as possible
          shopt -s lithist
          # reedit failed history substitutions
          shopt -s histreedit
          # enforce careful mode... we'll see how this goes
          shopt -s histverify
          
          # timestamp all history entries
          HISTTIMEFORMAT='%Y-%m-%d:%H:%M '
          
          # not the default, we like to be explicit that we are not using defaults
          HISTFILE=~/.bash_eternal_history
          # preserve all history, forever
          HISTSIZE=-1
          # preserve all history, forever, on disk
          HISTFILESIZE=-1
          # record only one instance of a command repeated after itself
          HISTCONTROL=ignoredups
          
          # preserve history across/between shell sessions...
          # ...when the shell exits...
          shopt -s histappend
          # ...and after every command...
          PROMPT_COMMAND="${PROMPT_COMMAND:+$PROMPT_COMMAND; } _setAndReloadHistory"
        
        EDIT: Remembered just after submitting that since I am on MacOS, I ran the command

          touch ~/.bash_sessions_disable
        
        back on August 22nd, 2020, to prevent Terminal from saving per-session information. I've never cleaned out ~/.bash_sessions, suppose I should, but it hasn't been updated since that day.
  • kstrauser 19 hours ago
    I see the benefit, but... that looks like work. From the post:

    > Why save vim /etc/rc.conf when sudo vim /etc/rc.conf is what I meant 100% of the time?

    Good point, but AFAIK all shells search history by recency by default. Whether I search with Fish's up-arrow, or with Atuin's ^r, the first result will be the match I executed most recently. In this case, that means I'd have to type `sudo vim /etc/rc.conf` correctly one time and then that's the first version of the command I'll see next time I look for it. And if it's something I do often enough for it to matter, but still manage to screw it up frequently, I'll turn it into an alias or function.

    This is the kind of busywork that feels like an ADHD tarpit to me. No. I absolutely do not need to optimize my shell history, lest I end up with a beautiful, tiny history file that's free from the detritus that would have come from me spending that time actually doing my job.

    • corytheboyd 19 hours ago
      Agree, the recency sort ends up normalizing history over time anyway. Use fzf with your history to see more than one match at a time, and you can at least glance through a few of the iterative commands you went through if your history isn’t “normalized” yet. I like keeping shell history dead simple: it’s a stack of executed statements. Period.

      I do pull very often run things into functions and source them into every shell session. Usually only parameterized things that aren’t trivial to use reverse search for.

  • web007 18 hours ago
    Curation is probably a good idea, but keeping context is probably a better idea.

    The referenced "I don't keep history" philosophy is madness. You won't know what thing would have been useful to keep until you need it $later. Sure, you'll write down some good stuff and maybe alias it.

    That's fantastic, do more of that anyway.

    Don't pretend you're going to do that for every trick or gotcha you encounter, and don't think you're going to remember that one-off not-gonna-need-it thing you did last week when you find that bug again.

    My local history is currently 106022 lines, and that's not even my full synchronized copy, just this machine. It's isolated per-session and organized into ~/.history/YYYY/MM/machine_time_session hierarchy. It has 8325 "git status", 4291 "ll", 2403 "cd .." and 97 "date" entries which don't matter. Literal/complete entries, not including variations like "date +%Y%m%d" which are separate. I can ignore them, either by grepping them out or filtering mentally, but something as benign as "cd .." is INCREDIBLY useful to establish context when I'm spelunking through what I did to debug a thing 2 years ago.

    The even better version of both of these variants is to keep everything AND curate out useful stuff. That whole history (4 years locally) is 10MB, and my entire history compressed would be less than a megabyte.

    Edit: just realized who posted this, I overlapped with Tod at my first gig in Silicon Valley. Small world!
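A tally of literal entries like the "git status" and "ll" counts quoted above can be produced with standard tools. A minimal sketch, assuming plain history files with one command per line and no timestamp lines; `top_commands` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the N most frequent complete lines across the given history files.
# Usage: top_commands N file...
top_commands() {
    local n=$1
    shift
    cat -- "$@" | sort | uniq -c | sort -rn | head -n "$n"
}
```

The same pipeline with `grep -vxF` in front of it would show what remains once the high-frequency noise is filtered out, without deleting anything from the files.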

    • abathur 18 hours ago
      I think it's a byproduct of how macOS itself saves and restores window state, but the macOS Terminal.app has options to control restoring scrollback on reopen. History is good, but it's really great to be able to scroll back and see more context around the commands I was running. (This can still fail me. In a rarely-touched tab I might have years of scrollback, but I currently have it set to restore 10k lines and I may find there's only a few days or hours in a tab where I ran something noisy with verbose logging...)
  • eitau_1 17 hours ago
    I wish there was a (Jupyter) notebook-like interface in shells so only the final set of commands is saved in the history after a trial-and-error/refinement cycle
  • 127dot1 18 hours ago
    What the author does is nuts. But the advice of curating history is sound.
  • add-sub-mul-div 18 hours ago
    SecureCRT can log everything from all sessions and then you have not only command history but the output, psql query results, etc, all in the order/context of the original session. I'd find it hard to live without.
  • hamburglar 15 hours ago
    Obligatory mention of something I like to call “bashtags”, where you just annotate a command line with a descriptive and unique end-of-line comment, eg “#bounce-web”. This combined with Ctrl-r and you just type the bashtag to find it. This has saved me so much time.
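Besides Ctrl-r, a tagged command can also be pulled straight out of the history file. A minimal sketch: `find_bashtag` is a hypothetical helper, and the default path assumes bash's usual `~/.bash_history`.

```shell
#!/usr/bin/env bash
# Print every line of a history file that contains the given tag
# written as a trailing comment, e.g. "#bounce-web".
find_bashtag() {
    local tag=$1 histfile=${2:-$HOME/.bash_history}
    # -F: treat the tag as a fixed string, not a regular expression.
    grep -F -- "#$tag" "$histfile"
}
```

For example, `find_bashtag bounce-web` would list every saved command annotated with `#bounce-web`. Since this is a substring match, tags that are prefixes of other tags will match both.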
  • mock-possum 19 hours ago
    Wouldn’t LLM-driven contextual fuzzy search of your shell history be more useful than manually keeping it clean?

    I need fewer things to do, not more. Let the robot eat my history and make smart suggestions if it thinks I’m trying to do something similar to something I’ve done before. That’d be actually useful to me.

    • kccqzy 19 hours ago
      It would be too slow. LLMs incur too much latency so using LLMs gets you out of the flow. No way to compete against a simple substring or subsequence search.
    • kergonath 18 hours ago
      Why use a LLM when standard fuzzy search does the job, at a fraction of the computational cost?
    • tonymet 18 hours ago
      I like where you are going. Even a simple full-text search with ranking would work well.

      https://github.com/atuinsh/atuin