#+title: Understanding ob-shell.el
#+date: October 28, 2023
#+type: post
#+toc: t

=org-babel= enables embedding executable source code within a
document. Babel differs from other "notebook" applications in that the
source code, execution parameters, and results have the same
representation as the rest of the document.  It's all text. Delimiters
surround the source code and distinguish it from non-code text, as
well as provide options for execution.  When run, source code is
isolated from the non-code text, executed in a subprocess, and the
results inserted into the document.  Source code is a "first class
citizen".  Code, prose, and results are interchangeable.

=ob-shell= defines functions and variables that manage the execution
of several shell languages.

The codebase has existed since at least November 2009.


- *Sustainability* What changes will ease maintenance so that ob-shell
  and org-babel remain assets to the community?
- *Stability* How do we provide it?
- *Consistency* My impression is that a high level consideration of
  shell evaluation hasn't happened in at least a decade.  Instead, my
  impression is that the library has progressed by piecemeal fixes
  resulting in local optimizations.  My hypothesis is that this is a
  major source of edge cases.

This is no trivial matter.  It is a classic case of Chesterton's
fence:

"In the matter of reforming things, as distinct from deforming them,
there is one plain and simple principle; a principle which will
probably be called a paradox. There exists in such a case a certain
institution or law; let us say, for the sake of simplicity, a fence or
gate erected across a road. The more modern type of reformer goes
gaily up to it and says, 'I don't see the use of this; let us clear it
away.' To which the more intelligent type of reformer will do well to
answer: 'If you don't see the use of it, I certainly won't let you
clear it away. Go away and think. Then, when you can come back and
tell me that you do see the use of it, I may allow you to destroy it.'"

This document is a record of "going away and thinking."  It presents
an understanding of why things are the way they currently are, where
"current" means Org git commit [[https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=9183e3c723b812360d1042196416d521db590e9f][9183e3c72]].  Each section of code is
analyzed and recommendations are given for improvement.

* Meta
** Setup
Check out the required commit of the Org source.

#+begin_src sh :async t :session *understanding-ob-shell* :exports both :eval yes
if [ -d "/tmp/org-mode" ]; then rm -rf "/tmp/org-mode"; fi

cd /tmp
git clone https://git.savannah.gnu.org/git/emacs/org-mode.git
cd org-mode
git -c advice.detachedHead=false checkout 9183e3c723b812360d1042196416d521db590e9f
#+end_src

: org_babel_sh_prompt> org_babel_sh_prompt> org_babel_sh_prompt> Cloning into 'org-mode'...
: remote: Counting objects: 143862, done.
: remote: Compressing objects: 100% (30960/30960), done.
: remote: Total 143862 (delta 112762), reused 143860 (delta 112760)
: Receiving objects: 100% (143862/143862), 100.25 MiB | 2.68 MiB/s, done.
: Resolving deltas: 100% (112762/112762), done.
: org_babel_sh_prompt> HEAD is now at 9183e3c72 * lisp/org-crypt.el: Fix checkdoc warnings

#+begin_src emacs-lisp :results none :tangle no
;; we want xref to navigate within the checkout directory, not within
;; the currently loaded Org mode (which is not the checkout).

;; generate TAGS file (`xc/build-tags' is a personal helper that is
;; not defined in this document)
(funcall
 #'(lambda ()
     "Basically calls 'ctags -eR [DIR]' on the given directory."
     (xc/build-tags "/tmp/org-mode/" nil nil t)))

;; (visit-tags-table "/data/data/org.gnu.emacs/files/org-mode/TAGS")
(visit-tags-table "/tmp/org-mode/TAGS")

;; enable `xref-etags-mode' globally
(add-hook 'xref-after-jump-hook (lambda ()
                                  ;; (if (string-match-p "^/org-mode/" (buffer-file-name))
                                  (if (string-match-p "^/tmp/" (buffer-file-name))
                                      (xref-etags-mode 1))))

;; open the file of interest
;; (find-file-other-window "~/org-mode/lisp/ob-shell.el")
(find-file-other-window "/tmp/org-mode/lisp/ob-shell.el")

;; Bootstrap the TAGS referencing
(with-current-buffer "ob-shell.el"
  (xref-etags-mode 1))
#+end_src

#+begin_src emacs-lisp :results none :tangle no
(remove-hook 'xref-after-jump-hook (lambda ()
                                  ;; (if (string-match-p "^/org-mode/" (buffer-file-name))
                                     (if (string-match-p "^/tmp/" (buffer-file-name))
                                         (xref-etags-mode 1))))
#+end_src

** Utils
This probably isn't needed.
#+begin_src emacs-lisp :results none :tangle no
(defun xc/hash-text ()
  "Create hash of text.

If region is selected, use that.  Otherwise, check if in Org
src-block.  If so, use the entire source block."
  (interactive)
  (let ((text
         (cond
          ((use-region-p)
           (buffer-substring-no-properties (region-beginning) (region-end)))
          ((eq (car (org-element-at-point-no-context)) 'src-block)
           (org-element-property :value (org-element-at-point))))))
    (cond (text
           (kill-new (substring (secure-hash 'sha256 text) 0 8))
           (message "%s" (car kill-ring)))
          (t
           (message "Invalid region")))))
#+end_src

** Validation
Run the following block to check if all of the original source is
represented in this document.
#+begin_src emacs-lisp :results none :tangle no
;; tangle all emacs-lisp source blocks
(org-babel-tangle nil "/tmp/ob-shell-review.el" "emacs-lisp")

;; compare tangled sources with original (ignoring surrounding whitespace)
(let ((tangled-hash
       (secure-hash 'sha256
                    (with-temp-buffer
                      (insert-file-contents "/tmp/ob-shell-review.el")
                      (string-trim (buffer-substring-no-properties (point-min) (point-max))))))
      (source-hash
       (secure-hash 'sha256
                    (with-temp-buffer
                      (insert-file-contents "/tmp/org-mode/lisp/ob-shell.el")
                      (string-trim (buffer-substring-no-properties (point-min) (point-max)))))))
  (if (string= source-hash tangled-hash)
      (message "Annotated and original MATCH")
    (message "Annotated and original DO NOT MATCH")))
#+end_src

Run the following to manually compare the original source and the
source from this document.
#+begin_src emacs-lisp :results none :tangle no
;; Manual comparison
(ediff "/tmp/org-mode/lisp/ob-shell.el" "/tmp/ob-shell-review.el")
#+end_src

The original has mixed tabs and spaces.  Run the following to manually
compare the original source with tabs converted to spaces and the
source from this document.
#+begin_src emacs-lisp :results none :tangle no
;; Manual comparison with tabs converted to spaces (checkout contains
;; mixed tabs and spaces)
(progn
  (with-temp-buffer
    (insert-file-contents "/tmp/org-mode/lisp/ob-shell.el")
    (untabify (point-min) (point-max))
    (write-file "/tmp/ob-shell-tabs-to-spaces.el")
    (message "Wrote %s" "/tmp/ob-shell-tabs-to-spaces.el"))
  (ediff "/tmp/ob-shell-tabs-to-spaces.el" "/tmp/ob-shell-review.el"))
#+end_src

The order of definitions as given in the checkout is not good for
comprehension.  This document rearranges definitions in a logical
order.  The validation block below places definitions in the same
order as the checkout.

#+name: validation
#+begin_src emacs-lisp :eval never :noweb yes :tangle "/tmp/ob-shell-review.el"











;;; Helper functions














(provide 'ob-shell)

;;; ob-shell.el ends here
#+end_src


* Org Babel
** How blocks execute to obtain results
Shell blocks have two basic execution models depending on whether it's
a session or not (that is, a persistent environment or a temporary
environment).  In each case, a subprocess call is made to the
corresponding shell program.

For non-sessions, the final lisp call is to =call-process= (via
=process-file=).  This runs the shell in a synchronous process.  Stdout
and stderr are output to a buffer when the process completes.  The
output is scraped, cleaned, and returned for the block results.

For sessions, the final lisp call is to =process-send-string= (via
=comint-send-string=).  When the first session block is run, a process
buffer is created to collect output from an asynchronous process (see
[[https://www.gnu.org/software/emacs/manual/html_node/elisp/Output-from-Processes.html][Output from Processes]]).  Block source is sent to the buffer process
along with a command to echo delimiters.  Output is filtered until the
ending delimiter appears.  When the ending delimiter is found, output
is scraped, cleaned, and returned for the block results.
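
The delimiter technique can be sketched outside of Emacs.  The block
below is only an illustration under simplifying assumptions, not how
=ob-shell= is implemented: it pipes a block body, followed by a command
that echoes the end-of-evaluation token, into a single =sh= process and
then discards the token line.  The variable names (=eoe=, =body=, =raw=,
=result=) are made up for the example; only the token string comes from
=ob-shell=.

#+begin_src sh :results output
# Token that marks the end of evaluation (the string ob-shell uses).
eoe='org_babel_sh_eoe'
# Stand-in for the source of a session block.
body='echo "hello, world"'

# Send the body, followed by a command that echoes the token, to one
# shell process and collect everything it prints.
raw=$(printf '%s\n' "$body" "echo '$eoe'" | sh)

# Scrape: drop the token line and anything after it.
result=$(printf '%s\n' "$raw" | sed "/^$eoe\$/,\$d")

printf '%s\n' "$result"
#+end_src

In a real session the process stays alive and Emacs watches the process
buffer until the token appears; the pipe here stands in for that wait.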

*NOTE* Sessions are currently entangled with async!  See [[* "session" and "non-session" vs. "synchronous" and "asynchronous"]["session" and
"non-session" vs. "synchronous" and "asynchronous"]].

| header  | final lisp call     |
|---------+---------------------|
| shebang | call-process        |
| cmdline | call-process        |
| stdin   | call-process        |
| session | process-send-string |
| async   | process-send-string |
| default | call-process        |

** "session" and "non-session" vs. "synchronous" and "asynchronous"

  "session" means a shell environment is "persistent."  Each call is
  executed in the same environment.

  "Non-session" means a shell environment is "temporary."  Each call
  is executed in an independent environment.

  "synchronous" means that execution prevents the user from editing
  the document while results are obtained.

  "asynchronous" means that execution does not prevent the user from
  editing the document while results are obtained.

  These concepts are conflated (or entangled) in org-babel.

  + Why can't non-session blocks run asynchronously?
  + Why not make asynchronous the default?
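
The "persistent" versus "temporary" environment distinction can be seen
at the shell level.  This is a minimal sketch, assuming a POSIX =sh=;
the variable name =ob_demo_x= is invented for the example and has
nothing to do with =ob-shell=.

#+begin_src sh :results output
# Make sure the demo variable is not inherited from the environment.
unset ob_demo_x

# Non-session: each call runs in its own, temporary shell process, so
# an assignment made by the first call is gone by the second.
sh -c 'ob_demo_x=1'
non_session=$(sh -c 'echo "${ob_demo_x:-unset}"')

# Session: both lines are read by the same, persistent shell process.
session=$(printf '%s\n' 'ob_demo_x=1' 'echo "${ob_demo_x:-unset}"' | sh)

printf 'non-session: %s\n' "$non_session"
printf 'session: %s\n' "$session"
#+end_src

Neither half says anything about blocking: whether the caller waits for
the result is a separate question, which is the point of the definitions
above.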

** Framework versus implementation
  Currently, ob-shell and org-babel provide a set of functions and
  variables that implement subprocess calls.  org-babel, along with
  ob-template, imply a framework for creating new integrations with
  different applications.  However, in practice, are those functions
  and variables set up to be composed?  Not really.

  Asynchronous execution was introduced by ob-python and adopted by
  ob-shell.  Why can't ob-C execute asynchronously?  Because the
  implementation is not set up as a framework.

** TODO How async works
** TODO How =org-babel-comint-with-output= works
* ob-shell
** [2/9] Questions
- [ ] How is the word "shell" used within the =ob-shell= API?

  "shell" is the Babel package name.  Babel packages have the form
  =ob-<language>=.  However, "<language>" isn't always one-to-one.
  For example, =ob-C= defines functionality for D and C++.

  #+begin_src shell
  echo "hi"
  #+end_src

  It's confusing because it's also possible to call "sh".  However,
  "sh" is a specific shell application, the Bourne shell!
  Unfortunately, "sh" is often linked to bash which confuses this

  The "shell" word within the =ob-shell= source code is used as a
  common form on which specific shell APIs are based.  For example,
  =org-babel-shell-initialize= creates functions of the form
  "org-babel-execute:template" where "template" is a specific shell.
  The specific execute functions call the generic
  =org-babel-execute:shell= function.
- [ ] Should =ob-shell= and =ob-eshell= be separate packages?
- [ ] What is the execution path for inline versus non-inline blocks?

  Looking at =org-babel-default-header-args= (which is apparently only
  for inline blocks) and =org-babel-default-header-args:template=, the
  two seem separate.  Should they be separate?  Could they be
  combined?  How does this affect things like
- [ ] What is "posh"?

  I think it's supposed to mean "Powershell".  I don't use it but I've
  *never* heard it referred to that way, nor does Microsoft seem to
  use it.  If it's something that's in the code base (and therefore
  something we "support"), then we should use the correct name for it
  (or get rid of it).

  NNNNNNOOOOOO....according to the Worg page, it stands for
  "Policy-compliant Ordinary SHell".

  However, =org-babel-shell-set-prompt-commands= has a comment saying
  it's Powershell.  So which is it?
- [ ] Would it make sense for sessions to have the process buffer
  exist without a prompt?  Would this fix problems we see of prompts
  being returned in the results?

  Such a change would likely require updating
- [X] What is meant by "support"?

  Not all functionality is consistent nor available for some shells.
  This is not necessarily because of technical limitations of the
  shell.  The biggest culprit is probably Powershell.  Obviously, the
  software comes with no warranty or guarantee.

  However, I think it would be nice to provide as consistent behavior
  as possible.  So, I would say "support" means "we'll try to make
  things consistent but no guarantee :)"

- [ ] Is there a better way to handle the "variable quoting"?

  1. Convert to string
  2. Quote
  3. Use quoted version to define a shell variable

  This seems reasonable.

- [X] How do (or even do) :hlines and :sep or :separator work with ob-shell?

  These are strewn throughout the code base.  However, none of them
  are documented, nor is their use clear.  They should be fully
  implemented and documented or removed.

  There is *no* mention of the :separator keyword anywhere in the
  manual.  There is mention of a :sep keyword in the documentation for
  Texinfo export.  The string ":sep" appears twice in ob-shell.el:

  2 matches for ":sep" in buffer: ob-shell.el
     211:  (let ((sep (cdr (assq :separator params)))
     241:      (orgtbl-to-generic var  (list :sep (or sep "\t") :fmt echo-var

  These appear to perform the same role: specify how to separate items
  in a table.  However, the syntax is different and (as we shall see)
  the functionality not quite implemented.

  :hlines appears twice in the manual.  First, with respect to
  columnview blocks and again with Python blocks.  columnview blocks
  are not source blocks, so while the concept is probably similar,
  it's not clear how it translates (if at all) to source blocks.
  Regarding Python, the manual says,

  #+begin_quote
  In-between each table row or below the table headings, sometimes
  results have horizontal lines, which are also known as "hlines".
  The 'hlines' argument with the default 'no' value strips such lines
  from the input table.  For most code, this is desirable, or else
  those 'hline' symbols raise unbound variable errors.  A 'yes'
  accepts such lines, as demonstrated in the following example.
  #+end_quote

  Adapting the (only) example from the info manual, I can sort of get
  an answer:

  #+NAME: many-cols
  | a | b | c |
  | d | e | f |
  | g | h | i |

  #+BEGIN_SRC sh :var tab=many-cols :hlines yes
  echo $tab
  #+END_SRC

  : a b c hline d e f hline g h i

  #+BEGIN_SRC sh :var tab=many-cols :hlines yes
  echo "$tab"
  #+END_SRC

  | a     | b | c |
  | hline |   |   |
  | d     | e | f |
  | hline |   |   |
  | g     | h | i |

  Looking at the source code and playing around with it more, here is
  an example using :separator and :hlines.  When :hlines is "yes", it
  requires the use of :hline-string to specify what the hline looks
  like, otherwise it defaults to "hline" (as we saw above).

  #+NAME: many-cols
  | a | b | c |
  | d | e | f |
  | g | h | i |

  #+BEGIN_SRC sh :var tab=many-cols :separator -- :hline-string ++ :hlines yes
  echo "$tab"
  #+END_SRC

  | a--b--c |
  | ++      |
  | d--e--f |
  | ++      |
  | g--h--i |

  If the :separator is a pipe, then the result is the usual table:

  #+NAME: many-cols
  | a | b | c |
  | d | e | f |
  | g | h | i |

  #+BEGIN_SRC sh :var tab=many-cols :separator | :hline-string ++ :hlines yes
  echo "$tab"
  #+END_SRC

  | a  | b | c |
  | ++ |   |   |
  | d  | e | f |
  | ++ |   |   |
  | g  | h | i |

  So, we see that the :hlines and :separator keywords produce some
  result.  However, the result leaves a lot to be desired and it
  appears that the functionality is not fully implemented.  The
  results are not meaningful and :hlines requires an undocumented
  header argument, :hline-string.

- [ ] Do people use :separator outside of Texinfo exports?
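
The three "variable quoting" steps from the questions above (convert to
string, quote, define a shell variable) can be sketched in plain =sh=.
This is an illustration only, not the quoting code in =ob-shell=; the
helper name =quote= and the sample value are invented for the example.
The last lines also show why =echo $tab= and =echo "$tab"= gave
different results in the :hlines experiment.

#+begin_src sh :results output
# Step 1: the value as a string (tab-separated columns, one row per line).
value=$(printf "a\tb\tc\nd\te\tit's f")

# Step 2: quote it.  Wrap the value in single quotes and rewrite each
# embedded single quote as '\'' so the shell reads it back unchanged.
quote () {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Step 3: use the quoted version to define a shell variable.
eval "tab=$(quote "$value")"

# Unquoted expansion is split on whitespace: rows and columns collapse
# into a single line of space-separated words.
unquoted=$(echo $tab)
# Quoted expansion keeps the tabs and newlines.
quoted=$(printf '%s\n' "$tab")

printf '%s\n' "$unquoted"
#+end_src

Because the collapsing happens in the shell's word splitting, it is
independent of whatever separator is used to build the string.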

** [1/8] Refactor
- [ ] Make all symbols follow the convention of =org-babel-X= where X
  is the specific language.  "shell" is the generic name whereas
  something like "bash" or "sh" refers to a specific shell

  There are three names used:

  + org-babel-shell
  + org-babel-sh
  + ob-shell

  This module was originally called =ob-sh= and later changed to
  =ob-shell=.  When this happened, the meaning of =sh= changed.
  Before, "sh" meant a generic shell.  Now "shell" is the generic
  name and "sh" refers to a specific shell (=/bin/sh=), since that's
  the call that's actually made when executing.  Are "sh" functions
  specific to the "sh" shell or are they generic?

  It looks like when I submitted my async changes, I kept the
  =ob-shell= prefix.  My preference is for =ob-shell= since it matches
  the filename.  However, it shouldn't be there if it's not consistent
  with the rest of the file (or babel system).  The problem with
  introducing =ob-shell= as a prefix is that none of the other babel
  files use that convention.

  It looks like =ob-C= uses =org-babel-X= where X is the language.
  This is the only other Babel module to support multiple languages in
  a single library.
- [X] Add file local variable for nameless

  -*- nameless-current-name: "org-babel-shell"; -*-

  This should be handled by a =.dir-locals.el=.  It's not proper to
  have third-party code referenced in the mainline repo.  The
  =.dir-locals.el= should also be local and not part of the repo.
- [ ] Fix mixed tabs/spaces and remove trailing whitespace

  Both issues make preparing and reviewing commits needlessly complex.
  Executive decision: use spaces and remove trailing whitespace.
  We're not doing embedded work, so there's no point to space saving
  with tabs.  It should be one or the other.  Since the rest of the
  code is indented by spaces and the GNU standard is two spaces, I
  think we should enforce only spaces.
- [ ] Rearrange definitions.  The order of definitions doesn't assist
  the process of learning what this library does.  Aside from the
  entangled =org-babel-shell-initialize= / =org-babel-shell-names=,
  order doesn't matter for execution.  So, we should optimize the
  order for understanding.

  *Helper terms defined after usage* The organization of =ob-shell=
  hinders understanding. "Helper functions" are defined at the end of
  the file. This makes reading the code difficult; each function is
  defined in terms only introduced later on.

  *Solution: Reorder function declarations*
  1. org-babel-sh-eoe-indicator
  2. org-babel-sh-eoe-output
  3. org-babel-shell-names
  4. org-babel-default-header-args:shell
  5. org-babel-shell-results-defaults-to-output
  6. org-babel-sh-initiate-session
  7. org-babel-sh-var-to-string
  8. org-babel-sh-var-to-sh
  9. org-babel--variable-assignments:bash_assoc
  10. org-babel--variable-assignments:bash_array
  11. org-babel--variable-assignments:sh-generic
  12. org-babel--variable-assignments:bash
  13. org-babel-variable-assignments:shell
  14. org-babel-prep-session:shell
  15. org-babel-load-session:shell
  16. org-babel-sh-strip-weird-long-prompt
  17. org-babel-sh-evaluate
  18. org-babel-execute:shell
  19. org-babel-shell-initialize
- [ ] Remove needless helper functions

  The way the code is split up doesn't seem to really warrant being
  split up. The subdivisions are too small.  Most of the helper
  functions are only used once. If that's the case, is it really
  helpful to break them out separately? My opinion is no.  That might
  be the case if the function names and docstrings helped
  understanding the use and context. In my opinion, they don't. If
  they were instead not broken out and appeared in the context they
  are used, the context would be clear. The role of the function names
  could be satisfied by comments. This may be worth refactoring,
  considering it took me several hours to make sense of things.
- [ ] Implement a logical separation of concerns

  Aside from the useless helper functions, it's not clear why
  functions like =org-babel-execute:shell= and =org-babel-sh-evaluate=
  do exactly what they do.  There's initiating the process, getting
  the inputs, dispatching, and then providing output.  All of this
  appears to be split, ad hoc, between (at least) these two functions.
  Review and consider a different break down of functionality.  See
  the individual functions for details on how this might be done.
- [ ] Fix confusion about what "posh" is

- [ ] Remove :sep, :separator, and :hlines code

  First, they are not documented.  I doubt they are, or ever have
  been, used.

  Second, they do not appear to be implemented correctly.  Specifying
  :hlines along with :hline-string produces a result that doesn't
  appear useful.

  One option would be to implement these.  Doing this would require 1)
  understanding the tooling that currently exists and 2) integrating
  it into the shell context.

  Another option is to remove it until someone requests it.

  Given all the problems that exist currently with shells and the
  refactoring that needs to happen to make a consistent framework, I
  would prefer to simply remove this code.

* ob-shell code analysis
** DONE Top matter
No commentary, just a standard Emacs header. Included for validation.

#+name: top-matter
#+begin_src emacs-lisp :eval never :tangle no
;;; ob-shell.el --- Babel Functions for Shell Evaluation -*- lexical-binding: t; -*-

;; Copyright (C) 2009-2023 Free Software Foundation, Inc.

;; Author: Eric Schulte
;; Maintainer: Matthew Trzcinski <matt@excalamus.com>
;; Keywords: literate programming, reproducible research
;; URL: https://orgmode.org

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.

;;; Commentary:

;; Org-Babel support for evaluating shell source code.

;;; Code:
#+end_src

** DONE Requires
#+name: requires
#+begin_src emacs-lisp :eval never :tangle no
(require 'org-macs)
(org-assert-version)

(require 'ob)
(require 'org-macs)
(require 'shell)
(require 'cl-lib)
#+end_src

*** [2/2] Questions
- [X] What are each of the requires for and are they necessary?

  It appears like =ob= and =shell= are the only requires that are
  needed.

  Notice that =org-macs= is given twice.  The first require for
  =org-macs= is likely for =org-assert-version=.  The second is just
  extra as far as I can tell.

  =ob= loads features from other Org Babel files.  In fact, it loads
  =org-macs= and calls =org-assert-version=.  This makes =org-macs=
  even more redundant.

  =shell= is needed for the =shell= command used in

  =cl-lib= appears unused.  It was introduced to =ob-shell.el= in 2016
  with commit 0f7df327, replacing the older =cl= package.  =cl= was
  officially deprecated in Emacs version 27.

  #+begin_src sh :var emacs_etc_dir=(identity data-directory)
  grep -r "'cl' package is now officially deprecated" "$emacs_etc_dir" | cut -f 2 -d ":"
  #+end_src

  : ** The 'cl' package is now officially deprecated in favor of 'cl-lib'.

  Regarding backwards compatibility, it appears that =cl= was never used
  (except for, perhaps, a brief period (that I could not find)).

  If =ob-shell.el= relied on =cl=, we would expect that it was updated
  appropriately when =cl= was deprecated.  Since Emacs warns when the
  deprecated =cl= is loaded, if =ob-shell.el= required it, then we
  would see that warning.  Since we don't, it seems safe to assume that
  all =cl= calls were updated to use =cl-lib=.  The =cl-lib= uses a
  "cl-" prefix.
  However, we don't see that anywhere in =ob-shell.el=.  We only see the
  require itself.

  #+begin_src sh :results output
  grep "cl-" /tmp/org-mode/lisp/ob-shell.el
  #+end_src

  : (require 'cl-lib)

  It appears that =cl-lib= is not needed.

  Fun fact: the =cl-lib= has only ever been required (never used,
  according to the prefix).

  #+begin_src sh :results output
  cd /tmp/org-mode/
  git grep "cl-" $(git rev-list --all -- lisp/ob-shell.el) -- lisp/ob-shell.el
  #+end_src

  1a1f45d2368b97dc6ea206edcf4202a2d2f359cc:lisp/ob-shell.el:(require 'cl-lib)
  c42cdcda4789122c3c8ed477365b9369bdf0af87:lisp/ob-shell.el:(require 'cl-lib)
  6d85f851b3cf47abaf5197fe07bd793b5cf0d5dc:lisp/ob-shell.el:(require 'cl-lib)
  f5467b53ec9be02ceaca8494e58090b3972fe2ac:lisp/ob-shell.el:(require 'cl-lib)
  39de4a1848d12b1be929853bf884ec04e121d9f0:lisp/ob-shell.el:(require 'cl-lib)
  f7aa8c19f5170dbf09538686fb569f9b60acbd6c:lisp/ob-shell.el:(require 'cl-lib)
  ecb62e2e317b1a4b5b8a6c0f111ed7ef18413040:lisp/ob-shell.el:(require 'cl-lib)
  80d1bc63fff8bb0f92ab4dab9c3b534ccb4d4d69:lisp/ob-shell.el:(require 'cl-lib)
  93339de71b3e2aaa4f0cbaf13ed8bcbc3fa448f3:lisp/ob-shell.el:(require 'cl-lib)
  96a402780c0fd06ca015b6a31a96909d3ab11d23:lisp/ob-shell.el:(require 'cl-lib)
  e0815d75457e4a86f4940631729c98e318bc8231:lisp/ob-shell.el:(require 'cl-lib)
  e8ceb4a2cb3e4901320bc0d577f6bf6fab237879:lisp/ob-shell.el:(require 'cl-lib)
  801c93638ab4b6616567035c6438ad229bc14c1d:lisp/ob-shell.el:(require 'cl-lib)
  8151d52574f525fd922aadc43544688004cb7d14:lisp/ob-shell.el:(require 'cl-lib)
  e81a094383a2e06a80e6dfb5ed0ca1ece44026f2:lisp/ob-shell.el:(require 'cl-lib)
  e36c3cc21b8b1471e1f7928a118de693819c3f12:lisp/ob-shell.el:(require 'cl-lib)
  a35d163685908386833a3d549ed110931bf3915a:lisp/ob-shell.el:(require 'cl-lib)
  13d97ee18c3bd23ccd04b21e0e1cd78070874cdb:lisp/ob-shell.el:(require 'cl-lib)
  f9ea6c61ed716184a17a04399f408c9b922dbd53:lisp/ob-shell.el:(require 'cl-lib)
  5a229cbc44e67ac323b7c5fd46a422089838a6f1:lisp/ob-shell.el:(require 'cl-lib)
  99eafe3787e03ac31ab12b3cc28f7832ef8b0987:lisp/ob-shell.el:(require 'cl-lib)
  07c6b11258e4ce83b78a023e63412183cf9c4c9f:lisp/ob-shell.el:(require 'cl-lib)
  612f4db0907e6eb0e332c83704d19eb78943123e:lisp/ob-shell.el:(require 'cl-lib)
  2f53429413983106aec252e18a8b018649e6b1e0:lisp/ob-shell.el:(require 'cl-lib)
  08428fed7813f25afa9c2df9a0553eb95e264bf0:lisp/ob-shell.el:(require 'cl-lib)
  0c02928eb25c3ec867e01adb05b481cddb3992a6:lisp/ob-shell.el:(require 'cl-lib)
  4533d783c23f89232f7078e6c761541180a70126:lisp/ob-shell.el:(require 'cl-lib)
  6a1f6ee1f80cd9b4fe7d813105577a12fac06c38:lisp/ob-shell.el:(require 'cl-lib)
  ff5fc050d36db9d0cfad208e7626320278d26c0a:lisp/ob-shell.el:(require 'cl-lib)
  9c611fd8ac3483576a5c5871f05aecad1afafa5c:lisp/ob-shell.el:(require 'cl-lib)
  f584d37a67c7e199957c040973dd85e9606e9469:lisp/ob-shell.el:(require 'cl-lib)
  dcf179663626872bae88ff36f58ad1c745f96b45:lisp/ob-shell.el:(require 'cl-lib)
  4afb7f747b16ff32086bc3ac937380e774d6e54c:lisp/ob-shell.el:(require 'cl-lib)
  bb035512464fcb7306d7360b3ac2ba2c2f2e8f23:lisp/ob-shell.el:(require 'cl-lib)
  f57df8fc74df1b76aca35bcf0315636b4d3071f3:lisp/ob-shell.el:(require 'cl-lib)
  b289a65be71cc5ec8df393928f84df90787ced55:lisp/ob-shell.el:(require 'cl-lib)
  3e1641ef0aa01ae39f90a3cb532136484de617bb:lisp/ob-shell.el:(require 'cl-lib)
  ff0dcf52a5e258af82f3eaf1f8ec3b7cd022cb6b:lisp/ob-shell.el:(require 'cl-lib)
  b8df40eccc6ce19455f2bdce50f6d6f3f913d544:lisp/ob-shell.el:(require 'cl-lib)
  713f785017e908333caddd244fcc685745e78539:lisp/ob-shell.el:(require 'cl-lib)
  93f90c8412d61930aa367d8ca61e10ff44f4090a:lisp/ob-shell.el:(require 'cl-lib)
  140aacbf2f57e207a33417bb446060de52a4b312:lisp/ob-shell.el:(require 'cl-lib)
  79650ffbbdac17e9dc016571d3a25f7f32737fd6:lisp/ob-shell.el:(require 'cl-lib)
  6b52bc6a2153a8c60d23d0915246a60d3ee37a52:lisp/ob-shell.el:(require 'cl-lib)
  0dc3811a7a3f441564db35ec6b068751222c6544:lisp/ob-shell.el:(require 'cl-lib)
  250304bd2eb2449bb1fccd80b8efb6e25c6aa901:lisp/ob-shell.el:(require 'cl-lib)
  0f7df32711170906a47594cb2a397c6e5d9c46b7:lisp/ob-shell.el:(require 'cl-lib)

- [X] Should comments tell what the requires are for?

  I feel like yes for the =org-assert-version= in =ob=, but no for
  =shell=.  Maybe just no since =ob= is used for things other than

*** [0/2] Refactoring
- [ ] Remove all requires except =ob= and =shell=.
- [ ] Add comment that =ob= should be listed first in order to check

** DONE Function forward declares
#+name: function-forward-declares
#+begin_src emacs-lisp :eval never :tangle no
(declare-function org-babel-comint-in-buffer "ob-comint" (buffer &rest body)
                  t)
(declare-function org-babel-comint-wait-for-output "ob-comint" (buffer))
(declare-function org-babel-comint-buffer-livep "ob-comint" (buffer))
(declare-function org-babel-comint-with-output "ob-comint" (meta &rest body)
                  t)
(declare-function orgtbl-to-generic "org-table" (table params))
#+end_src

These declarations exist to satisfy the byte compiler (or to help it
find the definitions of functions defined in other files).

** DONE defvar org-babel-sh-eoe-output
#+name: org-babel-sh-eoe-output
#+begin_src emacs-lisp :eval never :tangle no
(defvar org-babel-sh-eoe-output "org_babel_sh_eoe"
  "String to indicate that evaluation has completed.")
#+end_src

Token used by =org-babel-comint-with-output= (within
=org-babel-sh-evaluate=) to indicate that execution has completed.
Used only by sessions.  See [[* How blocks execute to obtain results][How blocks execute to obtain results]].

*** [1/3] Questions
- [ ] Should this be renamed to indicate that it's only used by
  session?  Maybe update the docstring at least?

- [X] What is "eoe"?

  Probably "end of execution" or "end of evaluation".

- [ ] Why is it called "output"?

  Maybe because this string is what's sent to the shell process
  (output)?  However, that's not entirely true, as
  =org-babel-sh-eoe-indicator= is also sent.  Maybe because it's what
  =org-babel-comint-with-output= searches for in the output?

*** [0/1] Refactor
- [ ] Rename this

  It needs to have "sh" removed.  EOE should be spelled out or a
  different term used.  "output" should be made more clear.

** DONE defvar org-babel-sh-eoe-indicator
#+name: org-babel-sh-eoe-indicator
#+begin_src emacs-lisp :eval never :tangle no
(defvar org-babel-sh-eoe-indicator "echo 'org_babel_sh_eoe'"
  "String to indicate that evaluation has completed.")
#+end_src

Shell command used by =org-babel-comint-with-output= within
=org-babel-sh-evaluate=. It echoes =org-babel-sh-eoe-output= within
the comint, that being the token used to indicate that execution has
completed.  See [[* How blocks execute to obtain results][How blocks execute to obtain results]]. It is used only
with sessions.

#+begin_example
,#+begin_src sh :session *example*
echo "hello, world"
,#+end_src
#+end_example

#+begin_example
sh-5.1$ PROMPT_COMMAND=;PS1="org_babel_sh_prompt> ";PS2=
org_babel_sh_prompt> echo "hello, world"
echo 'org_babel_sh_eoe'      <------ This is the org-babel-sh-eoe-indicator
hello, world
org_babel_sh_prompt> org_babel_sh_eoe
#+end_example

*** [2/3] Questions
- [ ] What would it look like to explicitly define
  =org-babel-sh-eoe-indicator= in terms of =org-babel-sh-eoe-output=?

  If it's just =(format "echo '%s'" org-babel-sh-eoe-output)=, then
  that avoids the (minor) risk of a typo making the two strings
  disagree.

- [X] Is this string the same for all the supported shells?

  No. Posh (powershell) uses something like "Write-Host".  See

- [X] Does it matter that posh uses a different command to print to
  the prompt?

  Yes, it does if we want to support powershell.  Personally, I'd
  rather support cmd.exe than powershell.

*** [0/2] Refactor
- [ ] Rename this

  Remove "sh" and "eoe"

- [ ] Fix docstring

  This is *not* the "string to indicate that evaluation has
  completed".  It is the command that echoes the string used to
  indicate that evaluation has completed.

** PENDING defvar org-babel-sh-prompt
#+name: org-babel-sh-prompt
#+begin_src emacs-lisp :eval never :tangle no
(defvar org-babel-sh-prompt "org_babel_sh_prompt> "
  "String to set prompt in session shell.")
#+end_src

PENDING a full review of =org-babel-comint-with-output=.

Prompt for shell sessions.  This is set during
=org-babel-sh-initiate-session= using the
=org-babel-shell-set-prompt-commands= corresponding to the block

In practice, the prompt is recognized through =comint-prompt-regexp=,
which is built like this:

#+begin_src emacs-lisp :eval never :tangle no
(concat "^" (regexp-quote org-babel-sh-prompt)     " *")
(concat "^" (regexp-quote "org_babel_sh_prompt> ") " *")
"^org_babel_sh_prompt>  *"
#+end_src

=comint-prompt-regexp= is required by =org-babel-comint-with-output=.
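
To make the regexp concrete, the sketch below rebuilds it from a local
copy of the prompt string and strips the prompt from one line of
session output.  =my/prompt=, =my/prompt-regexp= and =my/line= are
hypothetical names; =regexp-quote=, =string-match-p= and
=replace-regexp-in-string= are ordinary Emacs functions.

#+begin_src emacs-lisp :eval never :tangle no
;; Local copy of the prompt string (hypothetical name).
(defvar my/prompt "org_babel_sh_prompt> ")

;; Built the same way as the regexp above.
(defvar my/prompt-regexp (concat "^" (regexp-quote my/prompt) " *"))

;; A line as it might appear in the session buffer.
(defvar my/line "org_babel_sh_prompt> hello, world")

(string-match-p my/prompt-regexp my/line)               ; => 0
(replace-regexp-in-string my/prompt-regexp "" my/line)  ; => "hello, world"
#+end_src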

*** [3/4] Questions
- [X] Why do we set the prompt?

  Because of how =org-babel-comint-with-output= uses

- [X] Many issues relate back to prompt filtering.  Would it benefit
  us to make the prompt empty?

  Possibly. However, the implementation of
  =org-babel-comint-with-output= limits us.  It searches on the
  =comint-prompt-regexp=.  If that's empty, then it will probably
  search more than we want to (not per line or per call).

- [X] Does =org-babel-comint-with-output= use a filter?  If not, why
  not?

  =org-babel-comint-with-output= is a deep rabbit hole.  It was
  created in 2009 by Eric Schulte.  It appears that originally plain
  filters were used.  The =org-babel-comint-with-output= macro was
  introduced to manage the filters.  The macro was simple then.  It is
  no longer simple (although Ihor has recently simplified it (thank
  you!)).  This will require more research.

- [ ] How does =org-babel-comint-with-output= work?

  Why is it a macro?  Does it need to be a macro?

*** [0/0] Refactor
** DONE defconst ob-shell-async-indicator
#+name: ob-shell-async-indicator
#+begin_src emacs-lisp :eval never :tangle no
(defconst ob-shell-async-indicator "echo 'ob_comint_async_shell_%s_%s'"
  "Session output delimiter template.
See `org-babel-comint-async-indicator'.")
#+end_src

Similar to =org-babel-sh-eoe-indicator=.

Template for a shell command.  It is used to construct the strings
that delimit results within the comint process, and only for async
sessions.  The formatted commands are sent by
=org-babel-comint-async-delete-dangling-and-eval= within
=org-babel-sh-evaluate=.  See [[* How blocks execute to obtain results][How blocks execute to obtain results]].

=org-babel-sh-evaluate= formats two indicators, a "start" and an
"end", along with a [[https://en.wikipedia.org/wiki/Universally_unique_identifier][Universally Unique IDentifier (UUID)]].  The UUID is
a placeholder used until results are available.  When the process
returns, the UUID is replaced with the results.
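
As a small illustration, here is how the template expands.  The
snippet uses a local copy of the template and a made-up identifier in
place of a real UUID; both =my/= names are hypothetical.

#+begin_src emacs-lisp :eval never :tangle no
;; Local copy of the template shown above.
(defconst my/async-indicator "echo 'ob_comint_async_shell_%s_%s'")

;; A fixed, made-up identifier standing in for a generated UUID.
(defvar my/uuid "1234-abcd")

(format my/async-indicator "start" my/uuid)
;; => "echo 'ob_comint_async_shell_start_1234-abcd'"
(format my/async-indicator "end" my/uuid)
;; => "echo 'ob_comint_async_shell_end_1234-abcd'"
#+end_src

Everything the shell prints between those two echoed lines belongs to
the block whose placeholder carries the same identifier.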

*** [0/0] Questions
*** [0/1] Refactor
- [ ] Change the "namespace" from "ob-shell-"

** DONE defcustom org-babel-shell-names
#+name: org-babel-shell-names-2
#+begin_src emacs-lisp :eval never :tangle no
(defcustom org-babel-shell-names
  '("sh" "bash" "zsh" "fish" "csh" "ash" "dash" "ksh" "mksh" "posh")
  "List of names of shell supported by babel shell code blocks.
Call `org-babel-shell-initialize' when modifying this variable
outside the Customize interface."
  :group 'org-babel
  :type '(repeat (string :tag "Shell name: "))
  :set (lambda (symbol value)
         (set-default-toplevel-value symbol value)
         (org-babel-shell-initialize)))
#+end_src

Used to define the shell languages supported by =ob-shell=.  Each item
in the list is the name of a shell binary.

*** [0/1] Questions
- [ ] What would it take to add Windows cmd to this list?

  Probably being maintainer or contributor to cmdproxy.c :)

*** [0/0] Refactor
See Refactor for =org-babel-shell-initialize=.

As a long-term personal goal, I'd like to support Windows cmd.  This
would be helpful for me at work.  It could also be a good pathway into
free software for someone.

** DONE Variable declarations
The following three forms automate variable definitions for the
various supported shells.

#+name: org-babel-default-header-args:shell
#+begin_src emacs-lisp :eval never :tangle no
(defvar org-babel-default-header-args:shell '())
#+end_src

This comes from [[https://git.sr.ht/~bzg/worg/tree/master/item/org-contrib/babel/ob-template.el#L83][=ob-template=]].  The only documentation for it says,
"optionally declare default header arguments for this language".  The
variable is not used anywhere in the ob-shell code base.  Maybe it's
intended for end-users rather than developers?  Searching for
"org-babel-default-header-args" within the Org source, I see that
other languages use it in their code base.  This seems like something
that should stay and something that should be documented.

Later, =org-babel-shell-initialize= will define a similar variable for
each of the other supported shells.

#+name: org-babel-shell-names-1
#+begin_src emacs-lisp :eval never :tangle no
(defvar org-babel-shell-names)
#+end_src

This is a variable to hold the list of shell names supported by
=ob-shell=.  =org-babel-shell-names= is actually defined later on.
This definition acts as a forward declaration so that
=org-babel-shell-initialize= works.

#+name: org-babel-shell-set-prompt-commands
#+begin_src emacs-lisp :eval never :tangle no
(defconst org-babel-shell-set-prompt-commands
  '(;; Fish has no PS2 equivalent.
    ("fish" . "function fish_prompt\n\techo \"%s\"\nend")
    ;; prompt2 is like PS2 in POSIX shells.
    ("csh" . "set prompt=\"%s\"\nset prompt2=\"\"")
    ;; PowerShell, similar to fish, does not have PS2 equivalent.
    ("posh" . "function prompt { \"%s\" }")
    ;; PROMPT_COMMAND can override PS1 settings.  Disable it.
    ;; Disable PS2 to avoid garbage in multi-line inputs.
    (t . "PROMPT_COMMAND=;PS1=\"%s\";PS2="))
  "Alist assigning shells with their prompt setting command.

Each element of the alist associates a shell type from
`org-babel-shell-names' with a template used to create a command to
change the default prompt.  The template is an argument to `format'
that will be called with a single additional argument: prompt string.

The fallback association template is defined in (t . \"template\")
alist element.")
#+end_src

Introduced as a result of the issue [[https://list.orgmode.org/CKK9TULBP2BG.2UITT31YJV03J@laptop/T/#mc8e3ca2f5f1b9a94040a68b4c6201234b209041c][babel output seems to drop
anything before % (in session)]].

This variable is only used with blocks that have the =:session= header
argument.  =org-babel-sh-initiate-session= uses it together with
=org-babel-sh-prompt= to reset the prompt.  Each command in
=org-babel-shell-set-prompt-commands= replaces the primary prompt with
whatever is in =org-babel-sh-prompt= and, where the shell has a
secondary (continuation) prompt, empties it.

Running the commands verbatim, without =format=, looks like this:

#+begin_example
ahab@pequod ~ $ guix shell fish
ahab@pequod ~ [env]$ fish
ahab@pequod ~> function fish_prompt\n\techo \"%s\"\nend
ahab@pequod ~> function fish_prompt
                   echo "%s"
%secho "hi"
#+end_example

#+begin_example
ahab@pequod ~ $ guix shell tcsh
ahab@pequod ~ [env]$ tcsh
> set prompt="%s"
set prompt2=""
#+end_example

The prompt becomes the literal "%s" because "%s" is a =format=
placeholder that has not been filled in yet.

When =org-babel-sh-initiate-session= runs, the template for the
corresponding shell is passed to the Lisp function =format=.  For
example, the default template (the one bash uses) is:

#+begin_src emacs-lisp :eval never :tangle no
"PROMPT_COMMAND=;PS1=\"%s\";PS2="
#+end_src

It is handed to =format= along with the prompt:

#+begin_src emacs-lisp :eval never :tangle no
(format "PROMPT_COMMAND=;PS1=\"%s\";PS2=" org-babel-sh-prompt)
#+end_src

The result is the command that gets run in the shell:

#+begin_src emacs-lisp :eval never :tangle no
"PROMPT_COMMAND=;PS1=\"org_babel_sh_prompt> \";PS2="
#+end_src

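The lookup with its fallback can be tried in isolation.  The sketch
below works on a trimmed local copy of the alist and takes the shell
name as a plain string; in =ob-shell= the key comes from
=(file-name-nondirectory shell-file-name)=.  The =my/= names are
hypothetical.

#+begin_src emacs-lisp :eval never :tangle no
;; Trimmed local copy of the alist (hypothetical name).
(defconst my/set-prompt-commands
  '(("fish" . "function fish_prompt\n\techo \"%s\"\nend")
    (t . "PROMPT_COMMAND=;PS1=\"%s\";PS2=")))

(defun my/prompt-command (shell prompt)
  "Return the command that sets PROMPT for SHELL, a shell name string."
  (format (or (cdr (assoc shell my/set-prompt-commands))
              ;; No entry for SHELL: fall back to the `t' template.
              (alist-get t my/set-prompt-commands))
          prompt))

(my/prompt-command "bash" "org_babel_sh_prompt> ")
;; => "PROMPT_COMMAND=;PS1=\"org_babel_sh_prompt> \";PS2="
(my/prompt-command "fish" "org_babel_sh_prompt> ")
;; => "function fish_prompt\n\techo \"org_babel_sh_prompt> \"\nend"
#+end_src
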
*** [2/2] Questions
- [X] What is the required form for =org-babel-default-header-args:shell=?

  See =org-babel-default-header-args=:

  Default arguments to use when evaluating a source block.

  This is a list in which each element is an alist.  Each key
  corresponds to a header argument, and each value to that header’s
  value.  The value can either be a string or a closure that evaluates
  to a string.

- [X] Why is =org-babel-shell-names= defined with defvar only to be
  later defined with defcustom?

  It looks to me like this is the reason: =org-babel-shell-initialize=
  (explained below) requires =org-babel-shell-names=.  However,
  =org-babel-shell-initialize= isn't called until
  =org-babel-shell-names= is actually defined using defcustom.  When
  that happens, defcustom calls =org-babel-shell-initialize=
  (according to its :set feature).
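
The interplay can be reproduced in miniature.  The sketch below is not
the =ob-shell= code: every =my/= name is hypothetical, and the
initializer only records the function symbols it would define.  It
relies on =defcustom= running the =:set= function when the variable is
first defined, which is the default initialization behaviour as far as
I can tell.

#+begin_src emacs-lisp :eval never :tangle no
(defvar my/shell-names)            ; forward declaration only, no value

(defvar my/generated-functions nil
  "Function symbols that `my/shell-initialize' would define.")

(defun my/shell-initialize ()
  "Record one function symbol per entry in `my/shell-names'."
  (setq my/generated-functions
        (mapcar (lambda (name) (intern (concat "my/execute:" name)))
                my/shell-names)))

(defcustom my/shell-names '("sh" "bash")
  "Shells to support."
  :group 'convenience
  :type '(repeat string)
  :set (lambda (symbol value)
         (set-default-toplevel-value symbol value)
         (my/shell-initialize)))

my/generated-functions   ; => (my/execute:sh my/execute:bash)
#+end_src

Changing the list through Customize, or with
=customize-set-variable=, runs the setter again, so the generated
functions follow the list.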

*** [0/4] Refactor
- [ ] Document =org-babel-default-header-args:shell= and the other
  =org-babel-default-header-args:template= variables.  Maybe say that
  it is different for inline source blocks, too (see

- [ ] Find a better implementation than the =org-babel-shell-names=
  defined twice, along with =org-babel-shell-initialize=.

  This looks like a problem to me.  First, it's not documented.  It
  should be documented.  My guess as to why it's not documented is
  that, even though defvar supports docstrings, the later call to
  defcustom clobbers it.  At the very least, there should be a lisp
  comment explaining what's going on.

  Since this is all tied up with =org-babel-shell-initialize=, see the
  comments there.

- [ ] Better document =org-babel-shell-set-prompt-commands=.  The way
  it's written doesn't explain why it exists or what it's for.  It
  explains a little about how it's used (the implementation (which may
  change)).  The comments within it are helpful, but only if you
  understand what the list is for.  Otherwise, it seems really to deal
  with setting PS2.

- [ ] Make =org-babel-shell-set-prompt-commands= private.  This is not
  something end-users should modify.

** PENDING defcustom org-babel-shell-results-defaults-to-output
#+name: org-babel-shell-results-defaults-to-output
#+begin_src emacs-lisp :eval never :tangle no
(defcustom org-babel-shell-results-defaults-to-output t
  "Let shell execution defaults to \":results output\".

When set to t, use \":results output\" when no :results setting
is set.  This is especially useful for inline source blocks.

When set to nil, stick to the convention of using :results value
as the default setting when no :results is set, the \"value\" of
a shell execution being its exit code."
  :group 'org-babel
  :type 'boolean
  :package-version '(Org . "9.4"))
#+end_src

Sets the default for the =:results= header argument when a block does
not give one.

I'm confused by this because it feels like there are several places in
the codebase which define "defaults" rather than one.  This concerns
me because it's not clear how they're incorporated into the broader
system.  The concern is basically having a state machine spread across
the system and running into issues where one clobbers the other.

*** [0/2] Questions
- [ ] How does this differ from =org-babel-default-header-args:shell=?

  The =org-babel-default-header-args:shell= isn't used directly
  anywhere in =ob-shell=.  =org-babel-default-header-args:<language>=
  is used within =org-babel-get-src-block-info=.

- [ ] Is this necessary or is it duplicating

*** [0/1] Refactor
- [ ] Fix grammar of docstring.

  "Make shell execution default to \":results output\""

** DONE defun org-babel-sh-initiate-session
#+name: org-babel-sh-initiate-session
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-sh-initiate-session (&optional session _params)
  "Initiate a session named SESSION according to PARAMS."
  (when (and session (not (string= session "none")))
    (save-window-excursion
      (or (org-babel-comint-buffer-livep session)
          (progn
            (shell session)
            ;; Set unique prompt for easier analysis of the output.
            (org-babel-comint-wait-for-output (current-buffer))
            (org-babel-comint-input-command
             (current-buffer)
             (format
              (or (cdr (assoc (file-name-nondirectory shell-file-name)
                              org-babel-shell-set-prompt-commands))
                  (alist-get t org-babel-shell-set-prompt-commands))
              org-babel-sh-prompt))
            (setq-local comint-prompt-regexp
                        (concat "^" (regexp-quote org-babel-sh-prompt)
                                " *"))
            ;; Needed for Emacs 23 since the marker is initially
            ;; undefined and the filter functions try to use it without
            ;; checking.
            (set-marker comint-last-output-start (point))
            (get-buffer (current-buffer)))))))
#+end_src

The main purpose of this function is to start a [[https://www.gnu.org/software/emacs/manual/html_node/elisp/Process-Buffers.html][Process Buffer]] and set
it up so that we can scrape output from it.

When initiating a new session, create a new process buffer with
=shell=.  Wait until the prompt appears.  When the prompt appears, the
shell is ready to receive input.  Then, change the shell prompt and
update the =comint-prompt-regexp=.  Return the buffer associated with
the session.

Note that for Emacs 23 compatibility, we must manually set
=comint-last-output-start=.

*** [1/2] Questions
- [X] Why doesn't the calling function decide to execute this
  function?

  When a block is executed, =org-babel-execute-src-block= is called.
  The language is parsed from the header line and the appropriate
  dispatch function called (for example =org-babel-execute:bash=).
  All dispatch functions call =org-babel-execute:shell=.  The first
  thing it does is check for whether to use a session.  However, the
  logic for this is strange: the check happens in the initiate
  function =org-babel-sh-initiate-session=, not the calling function,
  =org-babel-execute:shell=. Should it happen in the caller?

  One way to decide where the check belongs is to consider the
  information it needs.  The caller already has all of it.  For the
  callee to make the decision, that information has to be passed in.
  This seems unnecessary: there is no obvious benefit to passing it
  around.  Passing information costs memory, and it has to be written
  in by hand as an argument.  Which information is necessary?  What if
  more is needed yet isn't there?  Letting the callee decide whether
  or not to run invites extra data structures or passing around "the
  world".  I think this is what we see with the _params parameter (a
  "world" variable that isn't used).

  Fundamentally, having the function decide whether or not to run
  makes no sense.  If you're calling it, you've decided it should run.
  Why have the function second-guess you?  Doing this splits a single
  logical choice into two lexical locations.

- [ ] Why is _params a parameter?

  It's not used.  Was it used in a previous version?
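
What moving the decision to the caller might look like is sketched
below.  All three functions are hypothetical; =my/initiate-session=
merely pretends to start a session so that the snippet runs without a
shell.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/session-requested-p (session)
  "Non-nil when SESSION names a session, i.e. is neither nil nor \"none\"."
  (and session (not (string= session "none"))))

(defun my/initiate-session (session)
  "Hypothetical stand-in: pretend to start SESSION, return a buffer name."
  (format "*%s*" session))

(defun my/execute (params)
  "Decide here, in the caller, whether PARAMS asks for a session."
  (let ((session (cdr (assq :session params))))
    (if (my/session-requested-p session)
        (my/initiate-session session)
      'no-session)))

(my/execute '((:session . "none")))   ; => no-session
(my/execute '((:session . "demo")))   ; => "*demo*"
#+end_src

With the check in the caller, the initiate function takes only the
session name and the unused parameter has no reason to exist.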

*** [0/2] Refactor
- [ ] Move the decision to run the function to the caller
- [ ] Remove the _params parameter

** DONE defun org-babel-sh-var-to-string
#+name: org-babel-sh-var-to-string
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-sh-var-to-string (var &optional sep hline)
  "Convert an elisp value to a string."
  (let ((echo-var (lambda (v) (if (stringp v) v (format "%S" v)))))
    (cond
     ((and (listp var) (or (listp (car var)) (eq (car var) 'hline)))
      (orgtbl-to-generic var  (list :sep (or sep "\t") :fmt echo-var
                                    :hline hline)))
     ((listp var)
      (mapconcat echo-var var "\n"))
     (t (funcall echo-var var)))))
#+end_src

Convert a "value" to a string.  First define a helper function
=echo-var= that returns strings unchanged and converts anything else
with the "%S" option of =format=.  "%S" produces the object's printed
representation, the way =prin1= would: a number such as 42 becomes
"42" and a symbol such as =foo= becomes "foo".  Nothing is evaluated.
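
The helper is easy to try on its own.  The snippet gives the lambda a
hypothetical name, =my/echo-value=, so that it can be called directly;
the body is the same as =echo-var=.

#+begin_src emacs-lisp :eval never :tangle no
;; Same body as the `echo-var' lambda, under a hypothetical name.
(defun my/echo-value (v)
  "Return V unchanged if it is a string, else its printed form."
  (if (stringp v) v (format "%S" v)))

(my/echo-value "a string")   ; => "a string"
(my/echo-value 42)           ; => "42"
(my/echo-value 'foo)         ; => "foo"  (printed, not evaluated)
(format "%S" "a string")     ; => "\"a string\""
#+end_src

The last line shows why strings are special-cased: printing one with
"%S" would wrap it in an extra pair of double-quotes.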

=org-babel-sh-var-to-string= behaves as follows:

#+begin_example
,#+name: my-table
| col1  | col2   |
|-------+--------|
| test1 | test 2 |

,#+begin_src sh :stdin my-table :results output
cat
,#+end_src

: col1	col2
: test1	test 2
#+end_example

For a table, =var= comes in as:

(("col1" "col2") hline ("test1" "test 2"))

This is the result of =org-babel-ref-resolve=.  Unfortunately,
=org-babel-ref-resolve= doesn't document what that value will look
like, so this is inferred from what comes through: a raw Org table is
converted to a list in which each row is a nested list and each
horizontal rule is the symbol 'hline.

The first condition of the cond checks for a table by way of "is it a
list and is the first element another list or the symbol 'hline?"
When true, the table contents are passed to =orgtbl-to-generic=, which
converts them to a string according to the parameters it is given.
With the defaults used here, that means tab-delimited fields and no
hlines.

The second condition checks for a list:

#+NAME: my-list
- simple
- list
- without
- nesting

#+begin_src sh :results output :stdin my-list
cat
#+end_src

: simple
: list
: without
: nesting

In the case of a list, =var= comes through as a list of items:

("simple" "list" "without" "nesting")

=org-babel-sh-var-to-string= joins the elements of the list with
newlines.

Finally, the third condition is a catch-all that simply applies
=echo-var= to the value:

#+name: text
The third case

#+begin_src sh :results output :stdin text
cat
#+end_src

: The third case

Used in =org-babel-execute:shell= for stdin (to allow named tables as
inputs). Also used in =org-babel-sh-var-to-sh=.
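
The three cases can be imitated without loading Org.  The function
below is an approximation under a hypothetical name: instead of
calling =orgtbl-to-generic= it joins the fields itself and simply
drops 'hline rows, which matches the behaviour described above only
for the default arguments.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/value-to-string (val &optional sep)
  "Approximate `org-babel-sh-var-to-string' for VAL without Org."
  (let ((to-string (lambda (v) (if (stringp v) v (format "%S" v))))
        (sep (or sep "\t")))
    (cond
     ;; Table: a list whose first element is a row or the symbol `hline'.
     ((and (listp val) (or (listp (car val)) (eq (car val) 'hline)))
      (mapconcat (lambda (row) (mapconcat to-string row sep))
                 (remq 'hline val)
                 "\n"))
     ;; Flat list: one element per line.
     ((listp val) (mapconcat to-string val "\n"))
     ;; Anything else: a single value.
     (t (funcall to-string val)))))

(my/value-to-string '(("col1" "col2") hline ("test1" "test 2")))
;; => "col1\tcol2\ntest1\ttest 2"
(my/value-to-string '("simple" "list" "without" "nesting"))
;; => "simple\nlist\nwithout\nnesting"
(my/value-to-string "The third case")
;; => "The third case"
#+end_src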

*** [0/1] Questions
- [ ] Should the default separator be a tab or four spaces?

*** [0/5] Refactor
- [ ] Remove extra space between 'var' and the list of parameters in
  the call to =orgtbl-to-generic=.
- [ ] :hline parameter not used and defaults to nil.  Kill from parameter list.
- [ ] Change "var" to "val" (or something better)

  Function converts a "value" to a string, per the docstring.  This is
  true.  It converts some lisp expression (which is a value). Why then
  is it called 'var'? It appears to have nothing to do with a 'var'!
- [ ] Change 'echo-var' to something better

  It doesn't echo. And what is var? Maybe call it 'convert-to-string'.
- [ ] Document the conditional cases

  Say something about the expected form.  Say that one is a table,
  another a list, and the last a single value.  Say that it is
  expected to be used with a named block.

** DONE defun org-babel-sh-var-to-sh
#+name: org-babel-sh-var-to-sh
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-sh-var-to-sh (var &optional sep hline)
  "Convert an elisp value to a shell variable.
Convert an elisp var into a string of shell commands specifying a
var of the same value."
  (concat "'" (replace-regexp-in-string
               "'" "'\"'\"'"
               (org-babel-sh-var-to-string var sep hline))
          "'"))
#+end_src

This function quotes its argument.  The docstring inaccurately says,
"Convert an elisp value to a shell variable."  This is not true.  Yes,
it's used in a process of converting a lisp value to a shell variable.
However, that's not what this function does.

This function does two things: 1) each ' quote in the argument is
replaced by '"'"' and 2) the whole result is surrounded in single
quotes.  Step 1 is done to prevent single-quotes contained in the
argument from breaking the outer quotes added by step 2.  Step 2 is
done to ensure the quoted argument is interpreted as a literal.

This is easiest to see with a non-trivial example:

(org-babel-sh-var-to-sh "it'd") -> 'it'\"'\"'d'

See how the single quote is replaced by '"'"'?  The backslashes are
not part of the result; they are how Emacs prints a double-quote
inside a string.  It's maybe easier to see if we separate the quoted
single-quote with spaces:

'it '\"'\"' d'

All of this is made even more confusing because when these values are
viewed in Emacs, double-quotes are added around them (again, to
represent a string).

Most likely this quoting is shell specific.  I assume it only applies
to Bourne-like shells.  In truth, I'm not sure.
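
The two steps fit in a few lines.  =my/sh-quote= is a hypothetical
name for a function that does only the quoting, taking a string
directly rather than going through =org-babel-sh-var-to-string=.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/sh-quote (string)
  "Wrap STRING in single quotes, protecting embedded single quotes."
  (concat "'"                                           ; step 2, opening quote
          (replace-regexp-in-string "'" "'\"'\"'" string) ; step 1
          "'"))                                         ; step 2, closing quote

(my/sh-quote "plain")   ; => "'plain'"
(my/sh-quote "it'd")    ; => "'it'\"'\"'d'"
#+end_src

Handing the second result to a POSIX shell, for example as the
argument of =printf %s=, should give back it'd unchanged.  Emacs also
ships =shell-quote-argument=, which quotes in a different way; whether
it could stand in here is not something this sketch settles.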

This function is a helper used in the following:

- =org-babel--variable-assignments:sh-generic=
- =org-babel--variable-assignments:fish=
- =org-babel--variable-assignments:bash_array=
- =org-babel--variable-assignments:bash_assoc=

*** [2/3] Questions
- [ ] Is this quoting shell specific?
- [X] Does Emacs provide a function for this already?

  It seems not.

- [X] How do :hlines and :separator work?  Do they even work?

  Yes, but not well.

*** [0/1] Refactor
- [ ] Rename

  Quoting is fiddly and this exact process appears necessary for
  several different shells.  It makes sense to define it once to avoid
  needing to make multiple updates or need to debug multiple

** CURRENT defun org-babel--variable-assignments:bash_assoc
#+name: org-babel--variable-assignments:bash_assoc
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel--variable-assignments:bash_assoc
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as bash associative array."
  (format "unset %s\ndeclare -A %s\n%s"
          varname varname
          (mapconcat
           (lambda (items)
             (format "%s[%s]=%s"
                     varname
                     (org-babel-sh-var-to-sh (car items) sep hline)
                     (org-babel-sh-var-to-sh (cdr items) sep hline)))
           values
           "\n")))
#+end_src

Handles a two-dimensional array (a table).  The docstring says,
"Return a list of statements declaring the values as bash associative
array."  That is not quite right: the function returns a single
string, not a list.  The string unsets the variable, declares it as a
bash associative array, and assigns one element per row.
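
To see the shape of the generated shell code, here is a simplified,
self-contained imitation.  =my/quote= and =my/bash-assoc= are
hypothetical, and rows are restricted to (KEY VALUE) pairs of strings,
which is narrower than what =ob-shell= accepts.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/quote (s)
  "Single-quote the string S for a Bourne-like shell."
  (concat "'" (replace-regexp-in-string "'" "'\"'\"'" s) "'"))

(defun my/bash-assoc (varname rows)
  "Return bash code declaring VARNAME as an associative array.
ROWS is a list of (KEY VALUE) string pairs."
  (format "unset %s\ndeclare -A %s\n%s"
          varname varname
          (mapconcat (lambda (row)
                       (format "%s[%s]=%s"
                               varname
                               (my/quote (nth 0 row))
                               (my/quote (nth 1 row))))
                     rows
                     "\n")))

(my/bash-assoc "tbl" '(("k1" "v1") ("k2" "v 2")))
;; => "unset tbl\ndeclare -A tbl\ntbl['k1']='v1'\ntbl['k2']='v 2'"
#+end_src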

** TODO defun org-babel--variable-assignments:bash_array
#+name: org-babel--variable-assignments:bash_array
#+begin_src emacs-lisp :eval never :tangle no
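(defun org-babel--variable-assignments:bash_array
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a bash array."
  (format "unset %s\ndeclare -a %s=( %s )"
          varname varname
          (mapconcat
           (lambda (value) (org-babel-sh-var-to-sh value sep hline))
           values
           " ")))
#+end_src

Handles a simple list.  The docstring says, "Return a list of
statements declaring the values as a bash array."  Again, what comes
back is a single string: it unsets the variable and declares it as an
indexed bash array holding the quoted values.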

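A self-contained imitation shows the resulting declaration.
=my/bash-array= is a hypothetical name, and the elements are assumed
to be strings already.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/bash-array (varname items)
  "Return bash code declaring VARNAME as an indexed array of ITEMS."
  (format "unset %s\ndeclare -a %s=( %s )"
          varname varname
          (mapconcat (lambda (item)
                       ;; Single-quote each element, protecting any
                       ;; single quotes it contains.
                       (concat "'"
                               (replace-regexp-in-string "'" "'\"'\"'" item)
                               "'"))
                     items
                     " ")))

(my/bash-array "arr" '("simple" "it'd"))
;; => "unset arr\ndeclare -a arr=( 'simple' 'it'\"'\"'d' )"
#+end_src
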
** TODO defun org-babel--variable-assignments:sh-generic
#+name: org-babel--variable-assignments:sh-generic
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel--variable-assignments:sh-generic
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a generic variable."
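  (format "%s=%s" varname (org-babel-sh-var-to-sh values sep hline)))
#+end_src

The docstring says, "Return a list of statements declaring the values
as a generic variable."  In practice it returns a single string: a
basic "%s=%s" assignment of the quoted value to VARNAME.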

** TODO defun org-babel--variable-assignments:bash
#+name: org-babel--variable-assignments:bash
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel--variable-assignments:bash (varname values &optional sep hline)
  "Represent the parameters as useful Bash shell variables."
  (pcase values
    (`((,_ ,_ . ,_) . ,_)		;two-dimensional array
     (org-babel--variable-assignments:bash_assoc varname values sep hline))
    (`(,_ . ,_)				;simple list
     (org-babel--variable-assignments:bash_array varname values sep hline))
    (_                                  ;scalar value
     (org-babel--variable-assignments:sh-generic varname values sep hline))))
#+end_src

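Dispatch on the shape of =values= and call the appropriate assignment
function:

- =org-babel--variable-assignments:bash_assoc= for a two-dimensional
  array (a list whose first element is itself a list of at least two
  items)
- =org-babel--variable-assignments:bash_array= for a simple list
- =org-babel--variable-assignments:sh-generic= for a scalar value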
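
The =pcase= patterns are terse, so it can help to run them against a
few values.  =my/value-shape= is a hypothetical function that reuses
the three patterns but returns a symbol instead of calling an
assignment function.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/value-shape (values)
  "Classify VALUES with the same `pcase' patterns as the dispatcher."
  (pcase values
    (`((,_ ,_ . ,_) . ,_) 'assoc)   ; first element is a list of 2+ items
    (`(,_ . ,_) 'array)             ; any other non-empty list
    (_ 'scalar)))                   ; everything else, including nil

(my/value-shape '(("k" "v") ("k2" "v2")))   ; => assoc
(my/value-shape '("a" "b"))                 ; => array
(my/value-shape '(("only-one-column")))     ; => array
(my/value-shape "text")                     ; => scalar
(my/value-shape nil)                        ; => scalar
#+end_src

Note the third call: a table with a single column does not match the
first pattern and is treated as a simple list.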

** TODO defun org-babel--variable-assignments:fish
#+name: org-babel--variable-assignments:fish
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel--variable-assignments:fish
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a fish variable."
  (format "set %s %s" varname (org-babel-sh-var-to-sh values sep hline)))
#+end_src

** TODO defun org-babel-variable-assignments:shell
#+name: org-babel-variable-assignments:shell
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-variable-assignments:shell (params)
  "Return list of shell statements assigning the block's variables."
  (let ((sep (cdr (assq :separator params)))
        (hline (when (string= "yes" (cdr (assq :hlines params)))
                 (or (cdr (assq :hline-string params))
                     "hline"))))
    (mapcar
     (lambda (pair)
       (if (string-suffix-p "bash" shell-file-name)
           (org-babel--variable-assignments:bash
            (car pair) (cdr pair) sep hline)
         (if (string-suffix-p "fish" shell-file-name)
             (org-babel--variable-assignments:fish
              (car pair) (cdr pair) sep hline)
           (org-babel--variable-assignments:sh-generic
            (car pair) (cdr pair) sep hline))))
     (org-babel--get-vars params))))
#+end_src

Call the appropriate variable assignment function depending on the
shell type: bash if =shell-file-name= ends in "bash", fish if it ends
in "fish", and the generic form otherwise.  Each element of a :var
header arg is a pair whose car (the name) and cdr (the value) are
passed to that function.
** TODO defun org-babel-prep-session:shell
#+name: org-babel-prep-session:shell
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-prep-session:shell (session params)
  "Prepare SESSION according to the header arguments specified in PARAMS."
  (let* ((session (org-babel-sh-initiate-session session))
         (var-lines (org-babel-variable-assignments:shell params)))
    (org-babel-comint-in-buffer session
      (mapc (lambda (var)
              (insert var) (comint-send-input nil t)
              (org-babel-comint-wait-for-output session))
            var-lines))
    session))
#+end_src

** TODO defun org-babel-load-session:shell
#+name: org-babel-load-session:shell
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-load-session:shell (session body params)
  "Load BODY into SESSION."
  (save-window-excursion
    (let ((buffer (org-babel-prep-session:shell session params)))
      (with-current-buffer buffer
        (goto-char (process-mark (get-buffer-process (current-buffer))))
        (insert (org-babel-chomp body)))
      buffer)))
#+end_src

Start a process if none exists, define any variables that are given in
the header args (both by way of =org-babel-prep-session:shell=), and
insert body at the process mark of the associated process buffer.
Return the buffer associated with the session.

** TODO defun org-babel-sh-strip-weird-long-prompt
#+name: org-babel-sh-strip-weird-long-prompt
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-sh-strip-weird-long-prompt (string)
  "Remove prompt cruft from a string of shell output."
  (while (string-match "^% +[\r\n$]+ *" string)
    (setq string (substring string (match-end 0))))
  string)
#+end_src

This function does what it says. Used only in =org-babel-sh-evaluate=
and only for sessions.
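
To see what the regexp actually removes, here is a copy of the loop
under a hypothetical name, with an explicit return value, applied to
two made-up strings.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/strip-weird-long-prompt (string)
  "Copy of the loop above; return the stripped STRING."
  ;; Matches a line starting with "%", then spaces, then a run of
  ;; carriage returns, newlines or "$", then optional spaces.
  (while (string-match "^% +[\r\n$]+ *" string)
    (setq string (substring string (match-end 0))))
  string)

(my/strip-weird-long-prompt "%   \nhello")    ; => "hello"
(my/strip-weird-long-prompt "foo\n% \nbar")   ; => "bar"
#+end_src

The second call is worth noticing: because the string is cut at the
end of the match, anything that precedes the matched prompt is dropped
along with it.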

** TODO defun ob-shell-async-chunk-callback
#+name: ob-shell-async-chunk-callback
#+begin_src emacs-lisp :eval never :tangle no
(defun ob-shell-async-chunk-callback (string)
  "Filter applied to results before insertion.
See `org-babel-comint-async-chunk-callback'."
  (replace-regexp-in-string comint-prompt-regexp "" string))
#+end_src

** TODO defun org-babel-sh-evaluate
#+name: org-babel-sh-evaluate
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-sh-evaluate (session body &optional params stdin cmdline)
  "Pass BODY to the Shell process in BUFFER.
If RESULT-TYPE equals `output' then return a list of the outputs
of the statements in BODY, if RESULT-TYPE equals `value' then
return the value of the last statement in BODY."
  (let* ((shebang (cdr (assq :shebang params)))
         (async (org-babel-comint-use-async params))
         (results-params (cdr (assq :result-params params)))
         (value-is-exit-status
          (or (and
               (equal '("replace") results-params)
               (not org-babel-shell-results-defaults-to-output))
              (member "value" results-params)))
         (results
          (cond
           ((or stdin cmdline)         ; external shell script w/STDIN
            (let ((script-file (org-babel-temp-file "sh-script-"))
                  (stdin-file (org-babel-temp-file "sh-stdin-"))
                  (padline (not (string= "no" (cdr (assq :padline params))))))
              (with-temp-file script-file
                (when shebang (insert shebang "\n"))
                (when padline (insert "\n"))
                (insert body))
              (set-file-modes script-file #o755)
              (with-temp-file stdin-file (insert (or stdin "")))
              (with-temp-buffer
                (with-connection-local-variables
                 (apply #'process-file
                        (if shebang (file-local-name script-file)
                          shell-file-name)
                        stdin-file
                        (current-buffer)
                        nil
                        (if shebang (when cmdline (list cmdline))
                          (list shell-command-switch
                                (concat (file-local-name script-file)  " " cmdline)))))
                (buffer-string))))
           (session                    ; session evaluation
            (if async
                (progn
                  (let ((uuid (org-id-uuid)))
                    (org-babel-comint-async-register
                     session
                     (current-buffer)
                     "ob_comint_async_shell_\\(.+\\)_\\(.+\\)"
                     'ob-shell-async-chunk-callback
                     nil)
                    (org-babel-comint-async-delete-dangling-and-eval
                        session
                      (insert (format ob-shell-async-indicator "start" uuid))
                      (comint-send-input nil t)
                      (insert (org-trim body))
                      (comint-send-input nil t)
                      (insert (format ob-shell-async-indicator "end" uuid))
                      (comint-send-input nil t))
                    uuid))
              (mapconcat
               #'org-babel-sh-strip-weird-long-prompt
               (mapcar
                #'org-trim
                (butlast ; Remove eoe indicator
                 (org-babel-comint-with-output
                     (session org-babel-sh-eoe-output t body)
                   (insert (org-trim body) "\n"
                           org-babel-sh-eoe-indicator)
                   (comint-send-input nil t))
                 ;; Remove `org-babel-sh-eoe-indicator' output line.
                 1))
               "\n")))
           ;; External shell script, with or without a predefined
           ;; shebang.
           ((org-string-nw-p shebang)
            (let ((script-file (org-babel-temp-file "sh-script-"))
                  (padline (not (equal "no" (cdr (assq :padline params))))))
              (with-temp-file script-file
                (insert shebang "\n")
                (when padline (insert "\n"))
                (insert body))
              (set-file-modes script-file #o755)
              (if (file-remote-p script-file)
                  ;; Run remote script using its local path as COMMAND.
                  ;; The remote execution is ensured by setting
                  ;; correct `default-directory'.
                  (let ((default-directory (file-name-directory script-file)))
                    (org-babel-eval (file-local-name script-file) ""))
                (org-babel-eval script-file ""))))
           (t (org-babel-eval shell-file-name (org-trim body))))))
    (when (and results value-is-exit-status)
      (setq results (car (reverse (split-string results "\n" t)))))
    (when results
      (let ((result-params (cdr (assq :result-params params))))
        (org-babel-result-cond result-params
          results
          (let ((tmp-file (org-babel-temp-file "sh-")))
            (with-temp-file tmp-file (insert results))
            (org-babel-import-elisp-from-file tmp-file)))))))
#+end_src

This is where all the functionality lives.
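
One piece that can be studied in isolation is how the exit status is
chosen as the result.  The sketch below lifts the two relevant
expressions into hypothetical functions, with the user option turned
into an argument so that both settings can be tried.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/value-is-exit-status-p (result-params defaults-to-output)
  "Non-nil when the block's result should be its exit status.
DEFAULTS-TO-OUTPUT plays the role of
`org-babel-shell-results-defaults-to-output'."
  (or (and (equal '("replace") result-params)
           (not defaults-to-output))
      (member "value" result-params)))

(defun my/last-line (results)
  "Return the last non-empty line of the string RESULTS."
  (car (reverse (split-string results "\n" t))))

(my/value-is-exit-status-p '("replace") t)     ; => nil
(my/value-is-exit-status-p '("replace") nil)   ; => t
(my/last-line "hello, world\n0\n")             ; => "0"
#+end_src

Since the caller appends "echo $?" to the body when the exit status is
wanted, the last non-empty line of the output is that status.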

** ON-HOLD defun org-babel-execute:shell
Return here after explaining the functions used within here.

#+name: org-babel-execute:shell
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-execute:shell (body params)
  "Execute Shell BODY according to PARAMS.
This function is called by `org-babel-execute-src-block'."
  (let* ((session (org-babel-sh-initiate-session
                   (cdr (assq :session params))))
         (stdin (let ((stdin (cdr (assq :stdin params))))
                  (when stdin (org-babel-sh-var-to-string
                               (org-babel-ref-resolve stdin)))))
         (results-params (cdr (assq :result-params params)))
         (value-is-exit-status
          (or (and
               (equal '("replace") results-params)
               (not org-babel-shell-results-defaults-to-output))
              (member "value" results-params)))
         (cmdline (cdr (assq :cmdline params)))
         (full-body (concat
                     (org-babel-expand-body:generic
                      body params (org-babel-variable-assignments:shell params))
                     (when value-is-exit-status "\necho $?"))))
    (org-babel-reassemble-table
     (org-babel-sh-evaluate session full-body params stdin cmdline)
     (org-babel-pick-name
      (cdr (assq :colname-names params)) (cdr (assq :colnames params)))
     (org-babel-pick-name
      (cdr (assq :rowname-names params)) (cdr (assq :rownames params))))))
#+end_src

The fundamental entry point for =ob-shell= when a source block is
evaluated.  The Org Babel API expects a function named
=org-babel-execute:template=.  Recall that
=org-babel-shell-initialize= generated functions called
"org-babel-execute:template".  When a user executes a block, the
=ob-core= function =org-babel-execute-src-block= is called.  It builds
the symbol "org-babel-execute:lang" with =intern= and binds it to a
variable "cmd", where "lang" corresponds to whatever follows
"#+begin_src" in the block header ("lang" corresponds to what we've
been calling "template").  It then calls that function with =funcall=.

#+begin_src emacs-lisp :eval never :tangle no
;; org-babel-execute-src-block (excerpt)
;; ...
     (cmd (intern (concat "org-babel-execute:" lang)))  ; build the function symbol
     result exec-start-time)
;; ...
(setq exec-start-time (current-time)
      ;; ...
      (let ((r (save-current-buffer (funcall cmd body params))))  ; call org-babel-execute:lang
        (if (and (eq (cdr (assq :result-type params)) 'value)
;; ...
#+end_src

BODY is the inside of the source block.  PARAMS is an alist (see
below) of header arguments and other information.
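
The dispatch mechanism itself is small enough to reproduce.  In this
sketch the prefix "my/execute:" and both functions are hypothetical;
only the =intern= / =fboundp= / =funcall= pattern is the point.

#+begin_src emacs-lisp :eval never :tangle no
(defun my/execute:bash (body params)
  "Hypothetical language function; just report what it was given."
  (format "bash got %S with %d header args" body (length params)))

(defun my/execute-block (lang body params)
  "Look up the function for LANG by name and call it on BODY and PARAMS."
  (let ((cmd (intern (concat "my/execute:" lang))))
    (unless (fboundp cmd)
      (error "No execute function for %s" lang))
    (funcall cmd body params)))

(my/execute-block "bash" "echo hi" '((:session . "none")))
;; => "bash got \"echo hi\" with 1 header args"
#+end_src

Supporting another language under this scheme means nothing more than
defining a function with the right name, which is why
=org-babel-shell-initialize= generates one per shell.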

*** [1/3] Questions
- [ ] The =org-babel-execute:template= functions and
  =org-babel-execute:shell= are all related to dispatch.  How might we
  dispatch differently and are any of the alternatives better?

- [X] Why does the docstring say "This function is called by
  `org-babel-execute-src-block'"?

  It seems to me that it says this because it's trying to explain how
  the Org Babel API functions.  Unfortunately, how the Org Babel API
  functions is rather opaque.

- [ ] What is params?

  Params is a data structure whose specific structure depends on the
  context.  PARAMS, as seen within =org-babel-execute:shell=, stores
  information taken from the source block header as an [[info:elisp#Association Lists][alist]]:

  #+begin_src emacs-lisp :eval never :tangle no
  ;; What's stored in the 'params' parameter:
  '((:result-params "replace")
    (:result-type . value)
    (:results . "replace")
    (:exports . "code")
    (:session . "none")
    (:cache . "no")
    (:noweb . "no")
    (:hlines . "no")
    (:tangle . "no"))

  ;; Typical access pattern:
  (cdr (assq :results
             '((:result-params "replace")
               (:result-type . value)
               (:results . "replace")
               (:exports . "code")
               (:session . "none")
               (:cache . "no")
               (:noweb . "no")
               (:hlines . "no")
               (:tangle . "no"))))  ; "replace"
  #+end_src

  Note that PARAMS is defined differently elsewhere!  Some of these
  docstring descriptions, such as for
  =org-babel-variable-assignments=, are probably incorrect in calling
  it a plist.

  According to =org-babel--get-vars=,

  PARAMS is a quasi-alist of header args, which may contain multiple
  entries for the key `:var'.  This function returns a list of the cdr
  of all the `:var' entries."

  According to =org-babel--file-desc=,

  PARAMS is header argument values.

  According to =org-list--to-generic-plain-list=,

  PARAMS is a plist used to tweak the behavior of the transcoder.

  According to =org-babel-tangle--unbracketed-link=,

  The PARAMS are the 3rd element of the info for the same src block.

  According to =org-babel-comint-use-async=,

  PARAMS are the header arguments as passed to

  According to =org-babel-variable-assignments:plantuml=,

  PARAMS is a property list of source block parameters, which may
  contain multiple entries for the key `:var'.  `:var' entries in
  PARAMS are expected to be scalar variables."
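
  The "quasi-alist" wording matters in practice: =assq= returns only
  the first matching entry, so it misses repeated =:var= keys.  Below
  is a minimal sketch, assuming a hand-written PARAMS value.  Both
  =my/params= and =my/collect-vars= are hypothetical names made up for
  illustration; neither is part of Org.

  #+begin_src emacs-lisp :results none :tangle no
  (defvar my/params
    '((:var . (x . 1))
      (:results . "replace")
      (:var . (y . 2)))
    "Hand-written stand-in for PARAMS with two `:var' entries.")

  (defun my/collect-vars (params)
    "Return the cdr of every `:var' entry in PARAMS, in order."
    (delq nil
          (mapcar (lambda (pair)
                    (when (eq (car pair) :var) (cdr pair)))
                  params)))

  (cdr (assq :var my/params))   ; only the first entry: (x . 1)
  (my/collect-vars my/params)   ; every entry: ((x . 1) (y . 2))
  #+end_src

  This is why a plain alist lookup is fine for keys like =:results=
  but not for =:var=.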

*** [0/1] Refactor
- [ ] Remove the statement "This function is called by
  `org-babel-execute-src-block'." from the docstring.  Although a true
  statement, it's also true that this function is called by all of the
  =org-babel-execute:template= functions.

** DONE defun org-babel-shell-initialize
#+name: org-babel-shell-initialize
#+begin_src emacs-lisp :eval never :tangle no
(defun org-babel-shell-initialize ()
  "Define execution functions associated to shell names.
This function has to be called whenever `org-babel-shell-names'
is modified outside the Customize interface."
  (dolist (name org-babel-shell-names)
    (let ((fname (intern (concat "org-babel-execute:" name))))
      (defalias fname
        (lambda (body params)
          (:documentation
           (format "Execute a block of %s commands with Babel." name))
          (let ((shell-file-name name))
            (org-babel-execute:shell body params))))
      (put fname 'definition-name 'org-babel-shell-initialize))
    (defalias (intern (concat "org-babel-variable-assignments:" name))
      #'org-babel-variable-assignments:shell
      (format "Return list of %s statements assigning to the block's \
variables."
              name))
    (funcall (if (fboundp 'defvar-1) #'defvar-1 #'set) ;Emacs-29
             (intern (concat "org-babel-default-header-args:" name))
             nil)))
#+end_src
Several different shells are supported.

Each needs the following definitions:

  1. An alias =org-babel-execute:template= for =org-babel-execute:shell=
  2. An alias =org-babel-variable-assignments:name= for
     =org-babel-variable-assignments:shell=
  3. A variable =org-babel-default-header-args:name=

These allow people to use any of the =org-babel-shell-names= as the
source block language.  For example, the =ob-core= functionality that
runs the source block (=org-babel-execute-src-block=) expects an
=org-babel-execute:template= function.

The way it works is this.  It iterates through the
=org-babel-shell-names= list.  For each element, it defines a symbol
for =org-babel-execute:template= according to the current list item
(name).  It makes this symbol an alias for a function that calls
=org-babel-execute:shell= such that =shell-file-name= is assigned to
the current list item.  For example, when processing "fish", a call to
=org-babel-execute:fish= will call =org-babel-execute:shell= yet any
call to an inferior shell process will use "fish" rather than the
shell stored in the SHELL environment variable at startup.  (It's
assumed that the command, such as "fish", is accessible through PATH.)
A =put= call makes =org-babel-execute:template= accessible from the
built-in help.  It next creates an alias for
=org-babel-variable-assignments:name= corresponding to
=org-babel-variable-assignments:shell=.  Finally, it creates a symbol
for =org-babel-default-header-args:name=, using a different function
(either =set= or =defvar-1=) depending on the Emacs version.
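
The core of the technique can be sketched on its own: generate one
function per name, each of which rebinds a special variable before
delegating to a shared function.  All =my/demo-= names below are
hypothetical stand-ins (for =shell-file-name=,
=org-babel-execute:shell=, and =org-babel-shell-initialize=); nothing
here touches a real shell.  One difference to note: =ob-shell= relies
on a lexical closure, whereas this sketch splices the name in with a
backquote so that it behaves the same with or without lexical binding.

#+begin_src emacs-lisp :results none :tangle no
(defvar my/demo-shell "sh"
  "Hypothetical stand-in for `shell-file-name'.")

(defun my/demo-execute (body)
  "Hypothetical stand-in for `org-babel-execute:shell'.
Report which shell would run BODY instead of running it."
  (format "%s would run: %s" my/demo-shell body))

(defun my/demo-initialize (names)
  "Define one `my/demo-execute:NAME' function per string in NAMES."
  (dolist (name names)
    (defalias (intern (concat "my/demo-execute:" name))
      ;; Backquote splices the current NAME into the function body.
      `(lambda (body)
         ;; Dynamic rebinding, as ob-shell does with `shell-file-name'.
         (let ((my/demo-shell ,name))
           (my/demo-execute body)))
      (format "Run BODY as if the shell were %s." name))))

(my/demo-initialize '("bash" "fish"))
(my/demo-execute:fish "echo hi")   ; "fish would run: echo hi"
(my/demo-execute "echo hi")        ; "sh would run: echo hi"
#+end_src

Because the list of names is passed as an argument, the sketch has no
dependency on a global list of shells.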

*** [1/1] Questions
- [X] Is there a better way to do this?

  This sets off my alarms because it feels too clever.

  - probably complex for someone coming from a non-lispy background,
    creates a closure to modify =shell-file-name=, meta-programming
  - creates (what feels like) unnecessary dependencies between
    =org-babel-shell-names= and =org-babel-shell-initialize=
  - hints at being an end-user function when it shouldn't be used that
    way
  - assumes that the things it's automating are all the same

  tl;dr Make =org-babel-shell-initialize= take a list parameter,
  rather than operating on =org-babel-shell-names= as a side effect.

  It has immediate problems when learning the codebase, as written
  about above for =org-babel-shell-names=.  It feels convoluted to
  me.  Understanding =org-babel-shell-names= requires knowing how
  =org-babel-shell-initialize= works.  =org-babel-shell-initialize=
  forces =org-babel-shell-names= to be defined twice.

  It seems like this function is trying to provide end-users with an
  easy way to update =ob-shell= with a new shell during run-time.
  That's cool and all...but that should be the job of the =ob-shell=
  maintainer.  It's not simple supporting different shells, even if
  they're similar.  What benefit is there in giving end-users a
  half-baked way to hang themselves?

  Further, I doubt many end-users are using =ob-shell= with a custom
  shell not supported.  There's one exception: cmd.exe on Windows.
  However, adding "cmd.exe" or "cmdproxy.exe" to
  =org-babel-shell-names= doesn't really work well.  Nor is it
  necessary: you can just use "shell".

  I think it's the automation part that concerns me most.  It's the
  fear of "What if there's an exception that doesn't fit into the
  automation".  My non-fear brain says, "Has that been a problem since
  this was implemented?  How hard would it be to change it when a
  problem arises that requires such a change?"  Leaving the
  implementation as is costs nothing (but conceptual energy).
  Re-working it will likely be easier later, if and when a problem
  appears, since there will be a clear goal.

  Part of me thinks it would be better to define
  =org-babel-shell-names= once, populated with all the shells, and then
  manually call =org-babel-shell-initialize= to define the
  execution functions, rather than forward declaring
  =org-babel-shell-names= and using the defcustom call to trigger the
  initialization.

  Maybe make =org-babel-shell-initialize= take a parameter which is a
  list of shells to initialize? In this way, it would break the
  dependency on =org-babel-shell-names=.  Then we could declare
  =org-babel-shell-names= once and have it call the initialize
  function.

  1. Make =org-babel-shell-initialize= take a parameter which is a list
     of shells to initialize
  2. Define the list =org-babel-shell-names=
  3. Set =org-babel-shell-names= to call =org-babel-shell-initialize=
     when set.
     + Can defcustom refer to itself during the set call? Yes! It
       seems so.

*** [0/3] Refactor
- [ ] Make =org-babel-shell-initialize= private.  No one but the
  maintainer should be running that.

- [ ] Document the interaction between =org-babel-shell-initialize=
  and =org-babel-shell-names=

- [ ] Make =org-babel-shell-initialize= functional

  1. Make =org-babel-shell-initialize= take a parameter which is a
     list of shells to initialize
  2. Define the list =org-babel-shell-names= using defcustom
  3. Set =org-babel-shell-names= to call =org-babel-shell-initialize=
     when set.

     For example,

     #+begin_src emacs-lisp :results none :tangle no
     (defun my-initialize (initialization-list)
       (dolist (name initialization-list)
         (message "%s" (concat "my-automatically-created-symbol:" name))))

     (defcustom my-name-list
       '("banana" "Rama")
       "List of names to be created by `my-initialize'"
       :group 'my-test
       :type '(repeat (string :tag "Name to create: "))
       :set (lambda (symbol value)
              (set-default-toplevel-value symbol value)
              (my-initialize my-name-list)))
     #+end_src

* Undocumented behaviors
** Problem: Undocumented functionality
Some of =ob-shell.el= is dedicated to undocumented behavior.  One
example is the =org-babel-load-in-session= function, which is bound to
=org-metaup-hook= (M-up).  It inserts the source body into a new
session process buffer.

This is the code path for M-up:
- =ob-core:org-babel-load-in-session=
- =ob-shell:org-babel-prep-session:shell=

The =:dir= header is now documented, yet it is a good example of the
same problem.  There have been many questions on the mailing list
regarding shell blocks and ssh.  The =:dir= header works with TRAMP to
create a remote connection.  This works seamlessly with Babel.
However, it was not clearly documented (the only example showed an R
code block and was located in a separate part of the manual).

* Addendum
** Mailing list items
Discussion on refactoring all non-session calls to be run as scripts,
including bugs to fix
- https://lists.gnu.org/archive/html/emacs-orgmode/2023-11/msg00137.html
- https://lists.gnu.org/archive/html/emacs-orgmode/2023-11/msg00142.html

Bug report and patches for :cmdline failings.
- https://lists.gnu.org/archive/html/emacs-orgmode/2023-11/msg00263.html

©2023 Excalamus.com