Understanding ob-shell.el

org-babel enables embedding executable source code within a document. Babel differs from other "notebook" applications in that the source code, execution parameters, and results have the same representation as the rest of the document. It's all text. Delimiters surround the source code and distinguish it from non-code text, as well as provide options for execution. When run, source code is isolated from the non-code text, executed in a subprocess, and some result inserted into the document. Source code is a "first class citizen". Code, prose, and results are interchangeable.

ob-shell.el defines functions and variables that manage the execution of several shell languages.

It appears that a high level consideration of shell evaluation hasn't happened in at least a decade. Instead, the library has progressed by piecemeal fixes. This has resulted in local optimizations. Abstractions and interfaces are not always cohesive. I suspect this is a source of bugs and that it deters contributions. There have been several recent calls for maintainers. My hope is that this document and the changes it inspires will encourage new people to join the project as well as make Org more reliable.

Updating ob-shell.el is no trivial matter. It's a classic case of Chesterton's fence,

"In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, 'I don't see the use of this; let us clear it away.' To which the more intelligent type of reformer will do well to answer: 'If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."

This document is a record of "going away and thinking." It presents an understanding of why things are the way they currently are, where "current" means org-mode git commit 9183e3c72. Each section of code is analyzed, questions are asked, and recommendations given. See [BROKEN LINK: * Validation] within the source file for how to compare a checked out version of the code to the code reviewed in this document.

Source file: https://excalamus.com/understanding-ob-shell-raw.html

Table of Contents

org-babel

Org Babel started as a package called "litorgy" around February 22, 2009. Support for shells, litorgy-shell.el, was added in commit e7701000 on March 31, 2009. The current file lisp/ob-shell.el was created in commit 583e7ab1 on December 13, 2013.

An incomplete history of ob-shell is here:

Date Commit Action
March 31, 2009 e7701000 litorgy-shell.el created
May 24, 2009 2b9b2ee7 renamed to org-babel/org-babel-shell.el
May 24, 2009 e82fb256 (parent)
May 24, 2009 8e9d3a51 (parent)
May 24, 2009 9247200b renamed to lisp/org-babel-shell.el
June 12, 2009 b64be813 renamed to lisp/langs/org-babel-shell.el
June 14, 2009 4fd1abec renamed to lisp/langs/org-babel-sh.el
June 17, 2010 e0e4d760 renamed to lisp/babel/langs/ob-sh.el
June 17, 2010 04152d3a renamed to lisp/babel/langs/ob-sh.el (from contrib?)
December 13, 2013 583e7ab1 renamed from lisp/ob-sh.el to lisp/ob-shell.el

ON-HOLD Difference between a "shell" and a "terminal"

On-hold until I can find the brain power to dig into kernel drivers and terminal I/O more.

The following is Linux centric because that's what information is available. Other resources included MINIX. All of these are inspired by UNIX. Maybe it's more accurate to say it leans POSIX.

A "shell" is a command driven process launcher. It reads input, evaluates it, prints results, and loops back to reading input. The following demonstrates how a simple shell operates. It's easy to imagine the example extended to receive input in batch. Such batch input to a shell is a "script."

/*
  * https://stevens.netmeister.org/631/
  */

#include <sys/types.h>
#include <sys/wait.h>

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sysexits.h>
#include <unistd.h>

char *
getinput (char *buffer, size_t buflen)
{
  printf ("$$ ");
  return fgets (buffer, buflen, stdin);
}

int
main (int argc, char **argv)
{
  char buf[BUFSIZ];
  pid_t pid;
  int status;

  /* cast to void to silence compiler warnings */
  (void) argc;
  (void) argv;

  while (getinput (buf, sizeof (buf)))
    {
      buf[strlen (buf) - 1] = '\0';

      if ((pid = fork ()) == -1)
        {
          fprintf (stderr, "shell: can't fork: %s\n", strerror (errno));
          continue;
        }
      else if (pid == 0)
        {                       /* child */
          execlp (buf, buf, (char *) 0);
          fprintf (stderr, "shell: couldn't exec %s: %s\n", buf,
                   strerror (errno));
          exit (EX_UNAVAILABLE);
        }

      /* parent waits */
      if ((pid = waitpid (pid, &status, 0)) < 0)
        {
          fprintf (stderr, "shell: waitpid error: %s\n", strerror (errno));
        }
    }

  exit (EX_OK);
}

POSIX defines a shell as:

A program that interprets sequences of text input as commands. It may operate on an input stream or it may interactively prompt and read commands from a terminal.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_347

A "terminal" is an interface to a shell. A shell simply "interprets sequences of text" for the purpose of starting new processes. A terminal manages the sequences of text sent and received from a shell. The terms "console" and "command-line" are also used.

Historically, a terminal was a physical device. Early computers repurposed Teletypes, electrical mechanical devices for sending and receiving telegrams, for input and output. As computers developed, so did terminals. Gradually, terminals went from recycled devices resembling typewriters to simple computers with memory, screens, and basic computational ability. Still, the primary purpose of a terminal is preparing text for a shell and outputting text from the shell to the user.

Terminals prepare text in one of two ways: character-wise or line-wise. For example, if the user inputs "dsta", realizes their mistake, presses backspace three times, and then inserts "ata" before pressing enter, the shell receives 11 characters. If the terminal instead handled this as a line, then only the 5 characters are necessary. Which approach to use depends on the hardware and requirements. Yet even within this simple example is hidden complexity. What constitutes the end of a line? A carriage return? A line feed? Both? If both, must the user type both? This and many other tasks like echoes, signal handling, jobs control, and special characters processing are managed by the terminal. Such tasks constitute what's called "line discipline."

Terminals act as an interface in other ways. The POSIX standard has this gem of a definition for "terminal,"

A character special file that obeys the specifications of the general terminal interface.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_401

A "character special file" is a file that refers to a device (such as a terminal device file) or that has special properties (such as /dev/null).

https://pubs.opengroup.org/onlinepubs/9699919799/nframe.html

So, according to POSIX, a terminal is a file that refers to a device such as a terminal. What that means is a terminal a file, as well as a physical device. Unfortunately, several "slights of hand" happen with the words "file", "device", and "terminal."

POSIX is a standardization of operating systems based on UNIX. If modular design is "the UNIX philosophy", then UNIX's motto must be "everything is a file." More accurately, "everything has a file descriptor." This means that (most) objects in the system have a numeric reference like 0, 1, or 2, which allow them to work with the system calls "open", "close", "read", and "write" as a stream of bytes, just like a file. Saying something is a "file" means it has the same interface as a file regardless of whether it's an electronic document, a keyboard, a mouse, or a terminal.

Any device we attach to our computer must communicate with the system. Data must flow from the device into our system and the system may send data back. Processes related to this I/O are often called "device drivers." Typically, there is one process, or driver, per device. UNIX-like operating systems drivers are represented by special files. The fact they're not typical files like a document makes them special; they're an interface. So, "device", "driver", and "file" often refer to the same thing: the interface the operating system presents for a connected device.

We've already seen that a file is another way to refer to a stream of bytes. Different devices group data within a stream in different ways. Some devices, such as a keyboard, operate one byte at a time. Other devices, like hard disks, use blocks of data. Since one byte corresponds to a character, such devices are called "character devices." Connecting our terms, "a character special file" is another way to say, "a driver associated with a device that sends and receives one byte at a time." Generally, devices which send and receive one byte at a time are called "serial" devices.

Physical terminals were some of the first devices ever connected to a computer. Indeed, the famous picture of Ken Thompson and Dennis Richie in front of a PDP-11 shows Ken at a Teletype Model 33 ASR teleprinter (probably interacting with UNIX). It's no surprise then that the device file used to represent a Teletype in UNIX is named "tty," a common abbreviation for (T)ele(ty)pes. TTY files have come to mean any character device.

In Linux, processes are grouped into sets of "sessions" in order to facilitate process management. Every session is tied to a terminal, called the "controlling terminal," from which processes in the session get their input and to which they send their output. A TTY device file corresponds to any device that could act as a controlling terminal. These are either serial devices or virtual consoles (full-screen terminal displays on the system video monitor). There are various naming conventions for these devices/files. However, they always correspond to a serial device or virtual console.

Sometimes it's desirable to enforce a line discipline without a physical device. For example, we may want to automate shell interaction using a program or we may want to connect Emacs to a shell. Linux provides a generalization of TTY drivers called pseudoterminals, abbreviated by PTY.

The man page for "pty" says it best,

A pseudoterminal (sometimes abbreviated "pty") is a pair of virtual character devices that provide a bidirectional communication channel. One end of the channel is called the master; the other end is called the slave.

The slave end of the pseudoterminal provides an interface that behaves exactly like a classical terminal. A process that expects to be connected to a terminal, can open the slave end of a pseudoterminal and then be driven by a program that has opened the master end. Anything that is written on the master end is provided to the process on the slave end as though it was input typed on a terminal. For example, writing the interrupt character (usually control-C) to the master device would cause an interrupt signal (SIGINT) to be generated for the foreground process group that is connected to the slave. Conversely, anything that is written to the slave end of the pseudoterminal can be read by the process that is connected to the master end.

Data flow between master and slave is handled asynchronously, much like data flow with a physical terminal. Data written to the slave will be available at the master promptly, but may not be available immediately. Similarly, there may be a small processing delay between a write to the master, and the effect being visible at the slave.

https://www.man7.org/linux/man-pages/man7/pty.7.html

Regurgitating some information, on a PTY, the master end sends it's input to the slave's output:

                    PTY
            +--------+-------+
     input  |        |       | output
      -------->==============+----->
            |        |       |
e.g. emacs  | master | slave | e.g. shell
            |        |       |
      <-----+==============<--------
     output |        |       | input
            +--------+-------+

I don't actually know what this diagram means.

For C examples of working with a PTY, see:

rkoucha's example is good yet it's interactive and not easily embedded within an org document. I could not automate it and must return to it.

PTYs are compared to bidirectional pipes. Here's an example of implementing bidirectional pipes (AFAIU) between two processes.

/* gcc -o pipe3-example pipe3-example.c && ./pipe3-example

   This example creates a "bidirectional" pipe.

   A pipe enables unidirectional data channel between processes.
   Using two pipes allows two processes to each send and receive data.

   The pipe() Linux system command (man 2 pipe) says,

   "The array pipefd is used to return two file descriptors referring
   to the ends of the pipe.  pipefd[0] refers to the read end of the
   pipe.  pipefd[1] refers to the write end of the pipe.  Data written
   to the write end of the pipe is buffered by the kernel until it is
   read from the read end of the pipe."

   fd[0]          fd[1]
   read --------> write
   in             out

   Reading happens been writing.  Something is put in the pipe before
   it is taken out.

   So, given two pipes, A and B, and two processes (c)hild and
   (p)arent,

                 A
   cread a[0]<------->a[1] pwrite
   pread b[0]<------->b[1] cwrite
                 B

   The child reads from a[0] after the parent writes to a[1].
   The parent reads from b[0] after the child writes to b[1].

 */

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>


int
main ()
{
  int pipe_A[2];
  int pipe_B[2];
  ssize_t size_parent_read;
  ssize_t size_child_read;

  char parent_buffer[100];
  memset (parent_buffer, '\0', 100);

  char child_buffer[100];
  memset (child_buffer, '\0', 100);

  if (pipe (pipe_A) == 0 && pipe (pipe_B) == 0)
    {
      pid_t pid = fork ();
      if (pid == (pid_t) - 1)
        {
          fprintf (stderr, "Fork failed");
          exit (EXIT_FAILURE);
        }

      else if (pid > (pid_t) 0)
        {                       // Parent
          printf ("Parent writing...\n");
          write (pipe_A[1], "hello", sizeof("hello"));

          size_parent_read = read (pipe_B[0], parent_buffer, 99);
          printf ("Parent received: '%s'\n", parent_buffer);
        }

      else if (pid == (pid_t) 0)
        {                       // Child
          size_child_read = read (pipe_A[0], child_buffer, 99);
          printf ("Child received: '%s'\n", child_buffer);

          printf ("Child writing...\n");
          write (pipe_B[1], "world", sizeof("world"));
        }
    }
  exit (EXIT_SUCCESS);
}

How blocks execute to obtain results

Shell blocks have two basic execution models depending on whether it's a session or not (that is, a persistent environment or a temporary environment). In each case, a subprocess call is made to the corresponding shell program.

For non-sessions, the final lisp call is to call-process (via process-file). This runs the shell in a synchronous process. Stdin and stderr are output to a buffer when the process completes. The output is scraped, cleaned, and returned for the block results.

For sessions, the final lisp call is to process-send-string (via comint-send-string). When the first session block is run, a process buffer is created to collect output from a synchronous process (see Output from Processes). Block source is sent to the buffer process along with command to echo delimiters. Output is filtered until the ending delimiter appears. When the ending delimiter is found, output is scraped, cleaned, and returned for the block results.

NOTE Sessions are currently entangled with async! See "session" and "non-session" vs. "synchronous" and "asynchronous".

header final lisp call
shebang call-process
cmdline call-process
stdin call-process
session process-send-string
async process-send-string
default call-process

See https://lists.gnu.org/archive/html/emacs-orgmode/2023-11/msg00242.html

"session" and "non-session" vs. "synchronous" and "asynchronous"

"Session" means a shell environment is "persistent." Each call is executed in the same environment. State exists between calls. In the early-history of Babel, this was called the "imperative" style.

"Non-session" means a shell environment is "temporary." Each call is executed in an independent environment. State does not exist between calls. In the early-history of Babel, this was called the "functional" style.

"synchronous" means that execution prevents the user from editing the document while results are obtained.

"asynchronous" means that execution does not prevent the user from editing the document while results are obtained.

These concepts are conflated (or entangled) in org-babel.

  • Why can't non-session blocks run asynchronously?
  • Why not make asynchronous the default?

Framework versus implementation

Currently, ob-shell and org-babel provide a set of functions and variables that implement subprocess calls. org-babel, along with ob-template, imply a framework for creating new integrations with different applications. However, in practice, are those functions and variables set up to be composed? Not really.

Asynchronous execution was introduced by ob-python and adopted by ob-shell. Why can't ob-C execute asynchronously? Because the implementation is not set up as a framework.

The following hangs Emacs:

while (1) {}

TODO How many different ways are there to execute a shell block?

Executing a shell block requires starting a process.

Processes are synchronous or asynchronous.

Three primitives exist for making processes:

  1. make-process (asynchronous)
  2. call-process (synchronous)
  3. call-process-region (synchronous)

Output from processes typically go to a buffer. This may be changed and handled by a filter. call-process has an option to directly send output to a file.

Subprocesses inherent the default-directory and the environment from Emacs. The environment may be changed using process-environment.

There are two types of asynchronous connections: "pty" and "pipe". "pty" means "pseudoterminal", sometimes called a "pseudotty" (where "tty" means "teletype"). This is a

pty connections allow for job control (C-c, C-z, etc.).

Shell blocks have three forms:

  1. a command pipeline
  2. shell scripting
  3. shebangs

Files paths have two forms:

  1. absolute
  2. relative ("expanded" (converted to absolute) using exec-path)

File paths are converted to absolute paths ("expanded") using exec-path. exec-path is initialized using PATH and serves a similar role within Emacs.

Files may be local or remote.

  • By directly passing it to the shell
  • Putting the source in a file and running

Processes have several return values:

  • success or failure of the process
  • stdout of the process
  • stderr of the process

[0/2] Questions

  • [ ]

    How might viewing ob-shell as a terminal change our approach to its design?

    According to definitions of "shell" and "terminal", ob-shell is a terminal.

  • [ ] Why does the pipe3-example returns results in a different order than when run in xfce4-terminal?

ob-shell

[2/9] General obshell: Questions

  • [ ]

    How is the word "shell" used within the ob-shell API?

    "shell" is the Babel package name. Babel packages have the form ob-<language>. However, "<language>" isn't always one-to-one. For example, ob-C defines functionality for D and C++.

    #+begin_src shell
    echo "hi"
    #+end_src
    

    It's confusing because it's also possible to call "sh". However, "sh" is a specific shell application, the Bourne shell! Unfortunately, "sh" is often linked to bash which confuses this point.

    The "shell" word within the ob-shell source code is used as a common form in which specific shell APIs are based. For example, org-babel-shell-initialize creates functions of the form "org-babel-execute:template" where "template" is a specific shell. The specific execute functions call the generic org-babel-execute:shell function.

  • [ ] Should ob-shell and ob-eshell be separate packages?
  • [ ]

    What is the execution path for inline versus non-inline blocks?

    Looking at org-babel-default-header-args (which is apparently only for inline blocks) and org-babel-default-header-args:template, the two seem separate. Should they be separate? Could they be combined? How does this affect things like org-babel-shell-results-defaults-to-output?

  • [ ]

    What is "posh"?

    I think it's supposed to mean "Powershell". I don't use it but I've never heard it referred to that way, nor does MicroSoft seem to use it. If it's something that's in the code base (and therefore something we "support"), then we should use the correct name for it (or get rid of it).

    NNNNNNOOOOOO….according to the Worg page, it stands for "Policy-compliant Ordinary SHell". https://packages.qa.debian.org/p/posh.html

    However, org-babel-shell-set-prompt-commands has a comment saying it's Powershell. So which is it?

  • [ ]

    Would it make sense for sessions to have the process buffer exist without a prompt? Would this fix problems we see of prompts being returned in the results?

    Such a change would likely require updating org-babel-comint-with-output.

  • [X]

    What is meant by "support"?

    Not all functionality is consistent nor available for some shells. This is not necessarily because of technical limitations of the shell. The biggest culprit is probably Powershell. Obviously, the software comes with no warranty or guarantee.

    However, I think it would be nice to provide as consistent behavior as possible. So, I would say "support" means "we'll try to make things consistent but no guarantee :)"

  • [ ]

    Is there a better way to handle the "variable quoting"

    1. Convert to string
    2. Quote
    3. Use quoted version to define a shell variable

    This seems reasonable.

  • [X]

    How do (or even do) :hlines and :sep or :separator work with ob-shell?

    These are strewn throughout the code base. However, none of them are documented, nor is their use clear. They should be fully implemented and documented or removed.

    There is no mention of the :separator keyword anywhere in the manual. There is mention of a :sep keyword in the documentation for Texinfo export. The string ":sep" appears twice in the ob-shell.el source:

    2 matches for ":sep" in buffer: ob-shell.el
       211:  (let ((sep (cdr (assq :separator params)))
       241:      (orgtbl-to-generic var  (list :sep (or sep "\t") :fmt echo-var
    

    These appear to perform the same role: specify how to separate items in a table. However, the syntax is different and (as we shall see) the functionality not quite implemented.

    :hlines twice in in the manual. First, with respect to columnview blocks and again with Python blocks. columnview blocks are not source blocks, so while the concept is probably similar, it's not clear how it translates (if at all) to source blocks. Regarding, Python, the manual says,

    In-between each table row or below the table headings, sometimes results have horizontal lines, which are also known as "hlines". The 'hlines' argument with the default 'no' value strips such lines from the input table. For most code, this is desirable, or else those 'hline' symbols raise unbound variable errors. A 'yes' accepts such lines, as demonstrated in the following example.

    Adapting the (only) example from the info manual, I can sort of get an answer:

    #+NAME: many-cols
    | a | b | c |
    |---+---+---|
    | d | e | f |
    |---+---+---|
    | g | h | i |
    
    #+BEGIN_SRC sh :var tab=many-cols :hlines yes
    echo $tab
    #+END_SRC
    
    #+RESULTS:
    : a b c hline d e f hline g h i
    
    #+BEGIN_SRC sh :var tab=many-cols :hlines yes
    echo "$tab"
    #+END_SRC
    
    #+RESULTS:
    | a     | b | c |
    | hline |   |   |
    | d     | e | f |
    | hline |   |   |
    | g     | h | i |
    

    Looking at the source code and playing around with it more, here is an example using :separator and :hlines. When :hlines is "yes", it requires the use of :hline-string to specify what the hline looks like, otherwise it defaults to "hline" (as we saw above).

    #+NAME: many-cols
    | a | b | c |
    |---+---+---|
    | d | e | f |
    |---+---+---|
    | g | h | i |
    
    #+BEGIN_SRC sh :var tab=many-cols :separator -- :hline-string ++ :hlines yes
    echo "$tab"
    #+END_SRC
    
    #+RESULTS:
    | a--b--c |
    | ++      |
    | d--e--f |
    | ++      |
    | g--h--i |
    

    If the :separator is a pipe, then the result is the usual table:

    #+NAME: many-cols
    | a | b | c |
    |---+---+---|
    | d | e | f |
    |---+---+---|
    | g | h | i |
    
    #+BEGIN_SRC sh :var tab=many-cols :separator | :hline-string ++ :hlines yes
    echo "$tab"
    #+END_SRC
    
    #+RESULTS:
    | a  | b | c |
    | ++ |   |   |
    | d  | e | f |
    | ++ |   |   |
    | g  | h | i |
    

    So, we see that the :hlines and :separator keywords produce some result. However, result leaves a lot to desire and it appears like the functionality is not fully implemented. The results are not meaningful and :hlines requires an undocumented header argument :hline-string.

  • [ ] Do people use :separator outside of Texinfo exports?

[1/8] General ob-shell: Refactor

  • [ ]

    Make all symbols follow the convention of org-babel-X where X is the specific language. "shell" is the generic name whereas something like "bash" or "sh" refers to a specific shell implementation.

    There are three names used:

    • org-babel-shell
    • org-babel-sh
    • ob-shell

    This module was originally called ob-sh and later changed to ob-shell. When this happened, the meaning of the sh changed. Before, it meant a generic shell. "shell" now means a generic "shell" changing the meaning of "sh". "sh" refers now to a specific shell (/bin/sh) since that's the call that's actually made when executing. Are "sh" functions specific to the "sh" shell or are they generic?

    It looks like when I submitted my async changes, I kept the ob-shell prefix. My preference is for ob-shell since to matches the filename. However, it shouldn't be there if it's not consistent with the rest of the file (or babel system). The problem with introducing ob-shell as a prefix is that none of the other babel files use that convention.

    It looks like ob-C uses org-babel-X where X is the language. This is the only other Babel module to support multiple languages in a single library.

  • [X]

    Add file local variable for nameless

    -- nameless-current-name: "org-babel-shell"; --

    This should be handled by a .dirlocals. It's not proper to have third party code referenced in the mainline repo. The .dirlocals should also be local and not part of the repo.

  • [ ]

    Fixed mixed tabs/spaces and remove trailing whitespace

    Both issues make preparing and reviewing commits needlessly complex. Executive decision: use spaces and remove trailing whitespace. We're not doing embedded work, so there's no point to space saving with tabs. It should be one or the other. Since the rest of the code is indented by spaces and the gnu standard is two spaces, I think we should enforce only spaces.

  • [ ]

    Rearrange definitions. The order of definitions doesn't assist the process of learning what this library does. Aside from the entangled org-babel-shell-initialize / org-babel-shell-names, order doesn't matter for execution. So, we should optimize the order for understanding.

    Helper terms defined after usage Organization of the ob-shell hinders understanding. "Helper functions" are defined at the end of the file. This makes reading the code difficult; each function is defined in terms only introduced later on.

    Solution: Reorder function declarations

    1. org-babel-sh-eoe-output
    2. org-babel-sh-eoe-indicator
    3. org-babel-shell-names
    4. org-babel-default-header-args:shell
    5. org-babel-shell-results-defaults-to-output
    6. org-babel-sh-initiate-session
    7. org-babel-sh-var-to-string
    8. org-babel-sh-var-to-sh
    9. org-babel–variable-assignments:sh-generic
    10. org-babel–variable-assignments:basharray
    11. org-babel–variable-assignments:bashassoc
    12. org-babel–variable-assignments:bash
    13. org-babel–variable-assignments:fish
    14. org-babel-variable-assignments:shell
    15. org-babel-prep-session:shell
    16. org-babel-load-session:shell
    17. org-babel-sh-strip-weird-long-prompt
    18. org-babel-sh-evaluate
    19. org-babel-execute:shell
    20. org-babel-shell-initialize
  • [ ]

    Remove needless helper functions

    The way the code is split up doesn't seem to really warrant being split up. The subdivisions are too small. Most of the helper functions are only used once. If that's the case, is it really helpful to break them out separately? My opinion is no. That might be the case if the function names and docstrings helped understanding the use and context. In my opinion, they don't. If they were instead not broken out and appeared in the context they are used, the context would be clear. The role of the function names could be satisfied by comments. This may be worth refactoring, considering it took me several hours to make sense of things.

  • [ ]

    Implement a logical separation of concerns

    Aside from the useless helper functions, it's not clear why functions like org-babel-execute:shell and org-babel-sh-evaluate do exactly what they do. There's initiating the process, getting the inputs, dispatching, and then providing output. All of this appears to split, ad hoc, between (at least) these two functions. Review and consider a different break down of functionality. See the individual functions for details on how this might be done.

  • [ ] Fix confusion about what "posh" is
  • [ ]

    Remove :sep, :separator, and :hlines code

    First, they are not documented. I doubt they are, or ever have been, used.

    Second, they do not appear to be implemented correctly. Specifying :hlines along with :hline-string produces a result that doesn't appear useful.

    One option would be to implement these. Doing this would require 1) understanding the tooling that currently exists and 2) integrating it into the shell context.

    Another option is to remove it until someone requests it.

    Given all the problems that exist currently with shells and the refactoring that needs to happen to make a consistent framework, I would prefer to simply remove this code.

ob-shell code analysis

DONE Top matter

No commentary, just a standard Emacs header. Included for validation.

;;; ob-shell.el --- Babel Functions for Shell Evaluation -*- lexical-binding: t; -*-

;; Copyright (C) 2009-2023 Free Software Foundation, Inc.

;; Author: Eric Schulte
;; Maintainer: Matthew Trzcinski <matt@excalamus.com>
;; Keywords: literate programming, reproducible research
;; URL: https://orgmode.org

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.

;;; Commentary:

;; Org-Babel support for evaluating shell source code.

;;; Code:

DONE Requires

(require 'org-macs)
(org-assert-version)

(require 'ob)
(require 'org-macs)
(require 'shell)
(require 'cl-lib)

[2/2] Questions

  • [X]

    What are each of the requires for and are they necessary?

    It appears like ob and shell are the only requires needed.

    Notice that ob-macs is given twice. The first require for org-macs is likely for org-assert-version. The second is just extra as far as I can tell.

    ob loads features from other Org Babel files. In fact, it loads org-macs and calls org-assert-version. This makes org-macs even more redundant.

    shell is needed for the shell command used in org-babel-sh-initiate-session.

    cl-lib appears unused. It was introduced to ob-shell.el in 2016 with commit 0f7df327, replacing the deprecated cl package. cl was fully removed from Emacs in version 27.

    grep -r "'cl' package is now officially deprecated" "$emacs_etc_dir" | cut -f 2 -d ":"
    

    Regarding backwards compatiblity, it appears that cl was never used (except for, perhaps, a brief period (that I could not find)).

    If ob-shell.el relied on cl, we would expect it was updated appropriately that when cl was deprecated. Since Emacs no longer contains the code for cl, if ob-shell.el required it, then it should fail. Since it doesn't, it seems safe to assume that all cl calls were updated to use cl-lib. The cl-lib uses a "cl-" prefix. However, we don't see that anywhere in ob-shell.el. We only see the require itself.

    grep "cl-" /tmp/org-mode/lisp/ob-shell.el
    

    It appears that cl-lib is not needed.

    Fun fact: the cl-lib has only ever been required (never used, according to the prefix).

    cd /tmp/org-mode/
    git grep "cl-" $(git rev-list --all -- lisp/ob-shell.el) -- lisp/ob-shell.el
    
  • [X]

    Should comments tell what the requires are for?

    I feel like yes for the org-assert-version in ob, but no for shell. Maybe just no since ob is used for things other than org-assert-version.

[0/2] Refactoring

  • [ ] Remove all requires except ob and shell.
  • [ ] Add comment that ob should be listed first in order to check version

DONE Function forward declares

(declare-function org-babel-comint-in-buffer "ob-comint" (buffer &rest body)
                  t)
(declare-function org-babel-comint-wait-for-output "ob-comint" (buffer))
(declare-function org-babel-comint-buffer-livep "ob-comint" (buffer))
(declare-function org-babel-comint-with-output "ob-comint" (meta &rest body)
                  t)
(declare-function orgtbl-to-generic "org-table" (table params))

These declarations exist to satisfy the byte compiler (or to help it find the definitions of functions defined in other files).

[0/0] Questions

[0/0] Refactor

DONE defvar org-babel-sh-eoe-output

(defvar org-babel-sh-eoe-output "org_babel_sh_eoe"
  "String to indicate that evaluation has completed.")

Token used by org-babel-comint-with-output (within org-babel-sh-evaluate) to indicate that execution has completed. Used only by sessions. See How blocks execute to obtain results.

[1/3] Questions

  • [ ] Should this be renamed to indicate that it's only used by session? Maybe update the docstring at least?
  • [X]

    What is "eoe"?

    Probably "end of execution" or "end of evaluation".

  • [ ]

    Why is it called "output"?

    Maybe because this string is what's sent to the shell process (output)? However, that's not entirely true, as org-babel-sh-eoe-indicator is also sent. Maybe because it's what org-babel-comint-with-output searches for in the output?

[0/1] Refactor

  • [ ]

    Rename this

    It needs to have "sh" removed. EOE should be spelled out or a different term used. "output" should be made more clear.

DONE defvar org-babel-sh-eoe-indicator

(defvar org-babel-sh-eoe-indicator "echo 'org_babel_sh_eoe'"
  "String to indicate that evaluation has completed.")

Shell command used by org-babel-comint-with-output within org-babel-sh-evaluate. It echoes org-babel-sh-eoe-output within the comint, that being the token used to indicate that execution has completed. See How blocks execute to obtain results. It is used only with sessions.

#+begin_src sh :session *example*
echo "hello, world"
#+end_src

sh-5.1$ PROMPT_COMMAND=;PS1="org_babel_sh_prompt> ";PS2=
org_babel_sh_prompt> echo "hello, world"      <------ This is the org-babel-sh-eoe-indicator
echo 'org_babel_sh_eoe'
hello, world
org_babel_sh_prompt> org_babel_sh_eoe
org_babel_sh_prompt>

[2/3] Questions

  • [ ]

    What would it look like to explicitly define org-babel-sh-eoe-indicator in terms of org-babel-sh-eoe-output?

    If it's just (format "echo '%s'" org-babel-sh-eoe-output), then that avoids the (minor) problem of a typo creating a problem.

  • [X]

    Is this string the same for all the supported shells?

    No. Posh (powershell) uses something like "Write-Host". See https://stackoverflow.com/a/707666

  • [X]

    Does it matter that posh uses a different command to print to the prompt?

    Yes, it does if we want to support powershell. Personally, I'd rather support cmd.exe than powershell.

[0/2] Refactor

  • [ ]

    Rename this

    Remove "sh" and "eoe"

  • [ ]

    Fix docstring

    This is not the "string to indicate that evaluation has completed". It is the command used to call the string used to indicate that evaluation has completed.

DONE defvar org-babel-sh-prompt

(defvar org-babel-sh-prompt "org_babel_sh_prompt> "
  "String to set prompt in session shell.")

Prompt for shell sessions. This is set during org-babel-sh-initiate-session using the org-babel-shell-set-prompt-commands corresponding to the block language.

In practice, the prompt is given by the comint-prompt-regexp which looks like:

(concat "^" (regexp-quote org-babel-sh-prompt)     " *")
(concat "^" (regexp-quote "org_babel_sh_prompt> ") " *")
"^org_babel_sh_prompt>  *"

comint-prompt-regexp is required by org-babel-comint-with-output.

[3/4] Questions

  • [X]

    Why do we set the prompt?

    Because of how org-babel-comint-with-output uses comint-prompt-regexp.

  • [X]

    Many issues relate back to prompt filtering. Would it benefit us to make the prompt empty?

    Possibly. However, the implementation of org-babel-comint-with-output limits us. It searches on the comint-prompt-regexp. If that's empty, then it will probably search more than we want to (not per line or per call).

  • [X]

    Does org-babel-comint-with-output use a filter? If not, why not?

    org-babel-comint-with-output is a deep rabbit hole. It was created in 2009 by Eric Schulte. It appears that originally plain filters were used. The org-babel-comint-with-output macro was introduced to manage the filters. The macro was simple then. It is no longer simple (although Ihor has recently simplified it (thank you!)). This will require more research.

  • [ ]

    How does org-babel-comint-with-output work?

    Why is it a macro? Does it need to be a macro?

[0/1] Refactor

  • [ ]

    Indicate that this is required by org-babel-comint-with-output

    Or better yet, change how org-babel-comint-with-output functions so that this is not necessary.

DONE defvar ob-shell-async-indicator

(defconst ob-shell-async-indicator "echo 'ob_comint_async_shell_%s_%s'"
  "Session output delimiter template.
See `org-babel-comint-async-indicator'.")

Similar to org-babel-sh-eoe-indicator.

Used by org-babel-comint-async-delete-dangling-and-eval within org-babel-sh-evaluate.

Template for a shell command. Used to construct strings that delimit results within the comint process. Used only with async sessions. See How blocks execute to obtain results.

org-babel-sh-evaluate formats two indicators, a "start" and an "end", along with a Universally Unique IDentifier (UUID). The UUID is a placeholder used until results are available. When the process returns, the UUID is replaced with the results.

[0/0] Questions

[0/1] Refactor

  • [ ] Change the "namespace" from "ob-shell-"

DONE defcustom org-babel-shell-names

(defcustom org-babel-shell-names
  '("sh" "bash" "zsh" "fish" "csh" "ash" "dash" "ksh" "mksh" "posh")
  "List of names of shell supported by babel shell code blocks.
Call `org-babel-shell-initialize' when modifying this variable
outside the Customize interface."
  :group 'org-babel
  :type '(repeat (string :tag "Shell name: "))
  :set (lambda (symbol value)
         (set-default-toplevel-value symbol value)
         (org-babel-shell-initialize)))

Used to define shell languages supported by ob-shell. Each item in the list corresponds to the shell binary.

[0/1] Questions

  • [ ]

    What would it take to add Windows cmd to this list?

    Probably being maintainer or contributor to cmdproxy.c :)

[1/1] Refactor

  • [X] See Refactor for org-babel-shell-initialize.

As a long term personal goal, I'd like to support Windows cmd. This would be helpful for me at work. It would also be a good pathway for someone to free software.

DONE Variable declarations

The following three forms automate variable definitions for the various supported shells.

(defvar org-babel-default-header-args:shell '())

This comes from ob-template. The only documentation for it says, "optionally declare default header arguments for this language". The variable is not used anywhere in the ob-shell code base. Maybe it's intended for end-users rather than developers? Searching for "org-babel-default-header-args" within the Org source, I see that other languages use it in their code base. This seems like something that should stay and something that should be documented.

Later, org-babel-shell-initialize will define a similar variable for each of the other supported shells.

(defvar org-babel-shell-names)

This is a variable to hold the list of shell names supported by ob-shell. org-babel-shell-names is actually defined later on. This definition acts as a forward declaration so that org-babel-initialize works.

(defconst org-babel-shell-set-prompt-commands
  '(;; Fish has no PS2 equivalent.
    ("fish" . "function fish_prompt\n\techo \"%s\"\nend")
    ;; prompt2 is like PS2 in POSIX shells.
    ("csh" . "set prompt=\"%s\"\nset prompt2=\"\"")
    ;; PowerShell, similar to fish, does not have PS2 equivalent.
    ("posh" . "function prompt { \"%s\" }")
    ;; PROMPT_COMMAND can override PS1 settings.  Disable it.
    ;; Disable PS2 to avoid garbage in multi-line inputs.
    (t . "PROMPT_COMMAND=;PS1=\"%s\";PS2="))
  "Alist assigning shells with their prompt setting command.

Each element of the alist associates a shell type from
`org-babel-shell-names' with a template used to create a command to
change the default prompt.  The template is an argument to `format'
that will be called with a single additional argument: prompt string.

The fallback association template is defined in (t . \"template\")
alist element.")

Introduced as a result of the issue babel output seems to drop anything before % (in session).

This variable is only used with blocks containing the :session header. It is used in org-babel-sh-initiate-session with org-babel-sh-prompt to reset the prompt. Each command in org-babel-shell-set-prompt-commands removes the prompt and replaces it with whatever is in org-babel-sh-prompt, for the first two levels of prompts (if the shell supports multiple shell level prompts).

Running the commands alone looks like this:

ahab@pequod ~ $ guix shell fish
ahab@pequod ~ [env]$ fish
ahab@pequod ~> function fish_prompt\n\techo \"%s\"\nend
ahab@pequod ~> function fish_prompt
                   echo "%s"
               end
%secho "hi"
hi
%s
ahab@pequod ~ $ guix shell tsch
ahab@pequod ~ [env]$ tcsh
> set prompt="%s"
set prompt2=""
exit

It replaces the prompt with "%s". The "%s" is a format expression.

When org-babel-shell-set-prompt-commands is run by org-babel-sh-initiate-session, the command for corresponding shell gets fed into the lisp format command. For example, the default (which corresponds to bash) is

"PROMPT_COMMAND=;PS1=\"%s\";PS2="

this is used by format:

(format "PROMPT_COMMAND=;PS1=\"%s\";PS2=" org-babel-sh-prompt)

this results in the following being called in a shell:

"PROMPT_COMMAND=;PS1=\"org_babel_sh_prompt> \";PS2="

[2/2] Questions

  • [X]

    What is the required form for org-babel-default-header-args:shell?

    See org-babel-default-header-args:

    Default arguments to use when evaluating a source block.

    This is a list in which each element is an alist. Each key corresponds to a header argument, and each value to that header's value. The value can either be a string or a closure that evaluates to a string.

  • [X]

    Why is org-babel-shell-names defined with defvar only to be later defined with defcustom?

    It looks to me like this is the reason: org-babel-shell-initialize (explained below) requires org-babel-shell-names. However, org-babel-shell-initialize isn't called until org-babel-shell-names is actually defined using defcustom. When that happens, defcustom calls org-babel-shell-initialize (according to its :set feature).

[0/4] Refactor

  • [ ] Document org-babel-default-header-args:shell and the other org-babel-default-header-args:template variables. Maybe say that it is different for inline source blocks, too (see org-babel-default-header-args).
  • [ ]

    Find a better implementation than the org-babel-shell-names defined twice, along with org-babel-shell-initialize.

    This looks like a problem to me. First, it's not documented. It should be documented. My guess as why it's not documented is because, even though defvar supports docstrings, the later call to defcustom clobbers it. At the very least, there should be a lisp comment explaining what's going on.

    Since this is all tied up with org-babel-shell-initialize, see the comments there.

  • [ ] Better document org-babel-shell-set-prompt-commands. The way it's written doesn't explain why it exists or what it's for. It explains a little about how it's used (the implementation (which may change)). The comments within it are helpful, but only if you understand what the list is for. Otherwise, it seems really to deal with setting PS2.
  • [ ] Make org-babel-shell-set-prompt-commands private. This is not something end-users should modify.

DONE defcustom org-babel-shell-results-defaults-to-output

(defcustom org-babel-shell-results-defaults-to-output t
  "Let shell execution defaults to \":results output\".

When set to t, use \":results output\" when no :results setting
is set.  This is especially useful for inline source blocks.

When set to nil, stick to the convention of using :results value
as the default setting when no :results is set, the \"value\" of
a shell execution being its exit code."
  :group 'org-babel
  :type 'boolean
  :package-version '(Org . "9.4"))

Used to define the results header default.

I'm confused by this because it feels like there are several places in the codebase which define "defaults" rather than one. This concerns me because it's not clear how they're incorporated into the broader system. The concern is basically having a state machine spread across the system and running into issues where one clobbers the other.

[0/2] Questions

  • [ ]

    How does this differ from org-babel-default-header-args:shell?

    The org-babel-default-header-args:shell isn't used directly anywhere in ob-shell. org-babel-default-header-args:<language> is used within org-babel-get-src-block-info.

  • [ ] Is this necessary or is it duplicating org-babel-default-header-args:shell?

[0/1] Refactor

  • [ ]

    Fix grammar of docstring.

    "Make shell execution default to \":results output\""
    

DONE defun org-babel-sh-initiate-session

(defun org-babel-sh-initiate-session (&optional session _params)
  "Initiate a session named SESSION according to PARAMS."
  (when (and session (not (string= session "none")))
    (save-window-excursion
      (or (org-babel-comint-buffer-livep session)
          (progn
            (shell session)
            ;; Set unique prompt for easier analysis of the output.
            (org-babel-comint-wait-for-output (current-buffer))
            (org-babel-comint-input-command
             (current-buffer)
             (format
              (or (cdr (assoc (file-name-nondirectory shell-file-name)
                              org-babel-shell-set-prompt-commands))
                  (alist-get t org-babel-shell-set-prompt-commands))
              org-babel-sh-prompt))
            (setq-local comint-prompt-regexp
                        (concat "^" (regexp-quote org-babel-sh-prompt)
                                " *"))
            ;; Needed for Emacs 23 since the marker is initially
            ;; undefined and the filter functions try to use it without
            ;; checking.
            (set-marker comint-last-output-start (point))
            (get-buffer (current-buffer)))))))

The main purpose of this function is to start a Process Buffer and set it up so that we can scrape output from it.

When a initiating a new session, create a new process buffer with shell. Loop until the prompt appears. When the prompt appears, the shell is ready to receive input. Then, change the shell prompt and update the comint-prompt-regexp. Return the buffer associated with process.

Note that for Emacs 23 compatibility, we must manually set comint-last-output-start.

[1/2] Questions

  • [X]

    Why doesn't the calling function decide to execute this function?

    When a block is executed, org-babel-execute-src-block is called. The language is parsed from the header line and the appropriate dispatch function called (for example org-babel-execute:bash). All dispatch functions call org-babel-execute:shell. The first thing it does is check for whether to use a session. However, the logic for this is strange: the check happens in the initiate function org-babel-sh-initiate-session, not the calling function, org-babel-execute:shell. Should it happen in the caller?

    One way to consider which is the better way to call the function is to consider the information used to make the decision. The caller will have all the necessary information. For the callee to make the decision, it will need that information. Therefore, having the callee decide whether or not to run requires passing all the necessary information to it so that it can make that decision. This seems unnecessary: there is no obvious benefit to passing that information around. Passing information requires memory, it requires manually writing it in as an argument. Which information is necessary? What if more information is needed yet isn't there? When information is passed to the callee to make the decision to run or not, it may lead to different data structures or in passing around "the world". I think this is what we see with the _params parameter (which is a "world" variable that isn't used.)

    Fundamentally, having the function decide whether or not to run makes no sense. If you're calling it, you've decided it should run. Why have the function second-guess you? Doing this splits a single logical choice into two lexical locations.

  • [ ]

    Why is _params a parameter?

    It's not used. Was it used in a previous version?

[0/2] Refactor

  • [ ] Move the decision to run the function to the caller
  • [ ] Remove the _params parameter

DONE defun org-babel-sh-var-to-string

(defun org-babel-sh-var-to-string (var &optional sep hline)
  "Convert an elisp value to a string."
  (let ((echo-var (lambda (v) (if (stringp v) v (format "%S" v)))))
    (cond
     ((and (listp var) (or (listp (car var)) (eq (car var) 'hline)))
      (orgtbl-to-generic var  (list :sep (or sep "\t") :fmt echo-var
                                    :hline hline)))
     ((listp var)
      (mapconcat echo-var var "\n"))
     (t (funcall echo-var var)))))

Convert a "value" to a string. First define a helper function echo-var to keep strings as strings and convert anything else using the "%S" option of format. You can read the documentation for those, but it basically boils down to return primitive types like numbers as-is and evaluate symbols first.

org-babel-sh-var-to-string behaves like follows:

#+name: my-table
| col1  | col2   |
|-------+--------|
| test1 | test 2 |

#+begin_src sh :stdin my-table :results output :eval never-export
cat
#+end_src

#+RESULTS:
: col1	col2
: test1	test 2

For a table, var comes in as:

(("col1" "col2") hline ("test1" "test 2"))

This is the result of org-babel-ref-resolve. Unfortunately, org-babel-ref-resolve doesn't document what that value will look like. This is what a table apparently looks like. A raw Org table is converted to a list of lists such that rows are nested lists or the symbol 'hline.

The first condition of the cond checks for a table by way of "is it a list and is the first element another list or the symbol 'hline?" When true, the table contents are passed to orgtbl-to-generic which converts it to a string of some sort according to some given parameters. In our case, it's hard coded as tab delimited strings and no hlines.

The second condition checks for a list:

#+NAME: my-list
- simple
- list
- without
- nesting

#+begin_src sh :results output :stdin my-list :eval never-export
cat
#+end_src

#+RESULTS:
: simple
: list
: without
: nesting

In the case of a list, var comes through as a list of items:

("simple" "list" "without" "nesting")

org-babel-sh-var-to-string concatenates each element of the list together with a newline.

Finally, the third condition is a catch-all that simply does the conversion:

#+name: text
The third case

#+begin_src sh :results output :stdin text :eval never-export
cat
#+end_src

#+RESULTS:
: The third case

Used in org-babel-execute:shell for stdin (to allow named tables as inputs). Also used in org-babel-sh-var-to-sh.

[0/1] Questions

  • [ ] Should the default separator be a tab or four spaces?

[0/6] Refactor

  • [ ] Remove extra space between 'var' and the list of parameters in the call to orgtbl-to-generic.
  • [ ] :hline parameter not used and defaults to nil. Kill from parameter list.
  • [ ]

    Change "var" to "val" (or something better)

    Function converts a "value" to a string, per the docstring. This is true. It converts some lisp expression (which is a value). Why then is it called 'var'? It appears to have nothing to do with a 'var'!

  • [ ]

    Change 'echo-var' to something better

    It doesn't echo. And what is var? Maybe call it 'convert-to-string'.

  • [ ]

    Document the conditional cases

    Say something about the expected form. Say that one is a table, the other a list, the other a value. Say that it is expected to be used with a name block.

  • [ ] Document arguments

DONE defun org-babel-sh-var-to-sh

(defun org-babel-sh-var-to-sh (var &optional sep hline)
  "Convert an elisp value to a shell variable.
Convert an elisp var into a string of shell commands specifying a
var of the same value."
  (concat "'" (replace-regexp-in-string
               "'" "'\"'\"'"
               (org-babel-sh-var-to-string var sep hline))
          "'"))

This function quotes its argument. The docstring inaccurately says, "Convert an elisp value to a shell variable." This is not true. Yes, it's used in a process of converting a lisp value to a shell variable. However, that's not what this function does.

This function does two things: 1) each ' quote in the argument is replaced by '"'"' and 2) the whole result is surrounded in single quotes. Step 1 is done to prevent single-quotes contained in the argument from breaking the outer quotes added by step 2. Step 2 is done to ensure the quoted argument is interpreted as a literal.

This is easiest to see with a non-trivial example:

(org-babel-sh-var-to-sh "it'd") -> 'it'\"'\"'d'

See how the single quote is replaced by '"'"'? In this case, the double-quotes are escaped with backslashes. It's maybe easier to see if we separate the quoted single-quote with spaces:

'it '\"'\"' d'

All of this is made even more confusing because when these values are viewed in Emacs, double-quotes are added around it (again to represent literal).

Most likely this quoting is shell specific. I assume it only applies to Bourne-like shells. In truth, I'm not sure.

This function is a helper used in the following:

  • org-babel--variable-assignments:sh-generic
  • org-babel--variable-assignments:fish
  • org-babel--variable-assignments:bash_array
  • org-babel--variable-assignments:bash_assoc
this is a test
of a ref resolve
(org-babel-sh-var-to-sh (org-babel-ref-resolve "pointless-sep-hline") "--" "++")

[2/3] Questions

  • [ ] Is this quoting shell specific?
  • [X]

    Does Emacs provide a function for this already?

    It seems not.

  • [X]

    How do :hlines and :separator work? Do they even work?

    Yes, but not well. See General obshell: Questions.

[0/2] Refactor

  • [ ]

    Rename

    Quoting is fiddly and this exact process appears necessary for several different shells. It makes sense to define it once to avoid needing to make multiple updates or need to debug multiple functions.

  • [ ] Document arguments

DONE defun org-babel–variable-assignments:sh-generic

(defun org-babel--variable-assignments:sh-generic
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a generic variable."
  (format "%s=%s" varname (org-babel-sh-var-to-sh values sep hline)))

Create a string like "%s=%s" for use in creating variables. See Shell Parameters.

[0/2] Questions

  • [ ]

    What is meant by "generic variable"?

    According to the docstring, "generic" doesn't appear to mean "across" shells. Instead, it means something else. The word "generic" only shows up twice in the Bash manual and only then in the GNU Free Documentation License.

    If this is intended for use across shells, does that work? Looking at you, Powershell (if that's even supported).

  • [ ]

    Should variables be first 'unset'?

    'Unset' is called first for org-babel--variable-assignments:bash_assoc. Why is it not called here?

[0/2] Refactor

  • [ ]

    Remove sep and hline arguments

    See arguments elsewhere.

  • [ ] Document arguments

DONE defun org-babel–variable-assignments:basharray

(defun org-babel--variable-assignments:bash_array
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a bash array."
  (format "unset %s\ndeclare -a %s=( %s )"
          varname varname
          (mapconcat
           (lambda (value) (org-babel-sh-var-to-sh value sep hline))
           values
           " ")))

Create a string for defining Bash indexed arrays.

Indexed arrays are lists of values accessed by their position in the list. See Arrays.

unset is a Bash builtin to destroy arrays. Use declare -a myarray to create a new array. There are two ways to define elements of an array. One of them is myarray[index]=value. Another is NAME=(VALUE1 VALUE2 ... ) where each VALUE may be of the form '[SUBSCRIPT]='STRING.

Given a 'varname' for the array name and a list of pairs, construct an the string:

unset VARNAME
declare -a VARNAME=( VARNAME[key1]=value1 VARNAME[key2]=value2 ... VARNAME[keyN]=valueN )

This function is called when using the :var header argument.

echo ${myfriends[2]} is a friend of mine

[0/1] Questions

  • [ ] Does this work with named references, like a table?

[0/1] Refactor

  • [ ] Document arguments

DONE defun org-babel–variable-assignments:bashassoc

(defun org-babel--variable-assignments:bash_assoc
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as bash associative array."
  (format "unset %s\ndeclare -A %s\n%s"
          varname varname
          (mapconcat
           (lambda (items)
             (format "%s[%s]=%s"
                     varname
                     (org-babel-sh-var-to-sh (car items) sep hline)
                     (org-babel-sh-var-to-sh (cdr items) sep hline)))
           values
           "\n")))

Create a string for constructing an associative array in Bash syntax.

Bash supports "associative" arrays. Associative arrays are key-value stores (sometimes called "dictionaries"). See Arrays.

unset is a Bash builtin to destroy arrays. Use declare -A myarray to create a new array. There are two ways to define elements of an array. One of them is myarray[key]=value.

Given a 'varname' for the array name and a list of pairs, construct an the string:

unset VARNAME
declare -A VARNAME
VARNAME[key1]=value1
VARNAME[key2]=value2
...
VARNAME[keyN]=valueN

This function is called when using the :var header argument.

echo I have ${myarray[apple]} apples

[0/3] Questions

  • [ ]

    What are downsides to off-loading this kind of functionality to the user?

    The problem: for every syntax, for every shell, we need to create a command that maps to various shell constructs. For example, defining associative arrays uses different syntax in different shells:

    mksh$ typeset -a may[Germany]=Berlin
    mksh$ echo ${may[Germany]}
    Berlin
    

    I don't think there's a way around needing to construct a string that's sent to the subprocess. So, for each construct, we'll need helper functions like org-babel--variable-assignments:bash_assoc.

    This could be done 1:1, like with org-babel--variable-assignments:bash_assoc. That is for the shell functionality, there is a separate function to handle it. A concern is that as these accrue, someone will come along and say, "Hey, this syntax is all very similar. Why don't we 'simplify' the code and generate all the similar cases?" Then we'll be in the same situation as org-babel-shell-initialize where simple tasks become complicated code generators that have hidden dependencies and extra code needs to exist to handle the edge cases.

    By "off-loading" to the end-user, I mean, provide an API that makes this straight forward for end-users to implement. For example, something like hook functions.

  • [ ]

    How is this function used? Why is it provided in the first place?

    At first glance, this seems convenient. However, why does it exist? Why can't the user "simply" write a block that does this?

    Without looking into it, my guess is to convert an Org table to an array.

  • [ ]

    How are 'sep' and 'hline' supposed to work with this function?

    As far as I understand how org-babel-sh-var-to-sh works, these make no sense.

[0/2] Refactor

  • [ ]

    Document the parameters

    'varname' is what to name the array. 'values' is a list of pairs

  • [ ] The worg documentation is wrong on this topic

DONE defun org-babel–variable-assignments:bash

(defun org-babel--variable-assignments:bash (varname values &optional sep hline)
  "Represent the parameters as useful Bash shell variables."
  (pcase values
    (`((,_ ,_ . ,_) . ,_)               ;two-dimensional array
     (org-babel--variable-assignments:bash_assoc varname values sep hline))
    (`(,_ . ,_)                         ;simple list
     (org-babel--variable-assignments:bash_array varname values sep hline))
    (_                                  ;scalar value
     (org-babel--variable-assignments:sh-generic varname values sep hline))))

Call appropriate assignment function:

  • org-babel–variable-assignments:bashassoc
  • org-babel–variable-assignments:basharray
  • org-babel–variable-assignments:sh-generic

Uses pcase which does pattern matching on the form of a value and performs an action accordingly.

[1/2] Questions

  • [X]

    Is pcase the best way to do this?

    pcase is a little weird to read at first. To match the nested list appears needing to understand cons cells. This could be a problem for someone not familiar with them. It also requires quasi-quote. So, that's two levels of knowledge needed.

    • [X]

      Can it be used with a different pattern?

      The pattern doesn't make sense to me, too. The pattern is this:

      `((,_ ,_ . ,_) . ,_)
      

      Why not use a simpler form?

      `((,_ . ,_) . ,_)
      

      Or maybe one that's literally the form we want?

      `((,_ . (,_ . nil)) . ,_)
      

      As cons cells, a list of pairs looks like:

      '((3 . (4 . nil)) . nil)
      

      or with a second pair:

      '((3 . (4 . nil)) . ((5 . (6 . nil))))
      

      It looks like it's combining the first cons together.

      '((3 4 . nil) . nil)
      

      or with a second pair:

      '((3 4 . nil) . ((5 6 . nil))))
      

      Honestly, I'm getting annoyed at parsing this out. Why are we using cons cells here anyway? It seems like we need to use them in order capture the "rest" of the pairs.

      So why not something like this since we're really only checking that the first element is a pair?

      `((,_ ,_) . ,_)
      

      It works:

      (let ((values '(("apple" 3) ("banana" 4) ("cherry" 5))))
        (pcase values
          (`((,_ ,_) . ,_)
           (message "Simpler form catches"))
          (`((,_ ,_ . ,_) . ,_)
           (message "Simple form failed"))))
      
      (let ((values '("alice" "bob" "carol")))
        (pcase values
          (`((,_ ,_) . ,_)
           (message "Simpler form catches"))
          (`((,_ ,_ . ,_) . ,_)
           (message "Simple form failed"))
          (_ (message "Neither form catches (as desired)"))))
      
      (let ((values 42))
        (pcase values
          (`((,_ ,_) . ,_)
           (message "Simpler form catches"))
          (`((,_ ,_ . ,_) . ,_)
           (message "Simple form failed"))
          (_ (message "Neither form catches (as desired)"))))
      
    • [X]

      How could this same task be done without pcase?

      Clearly, pcase can get confusing and fiddly. It seems to me a good idea if we could avoid using it all together.

      We're using the form of the parameters to determine the action. Why? Because the parameters derive from the :var header which can take an arbitrary form.

      Have I been calling :var "incorrectly"?

      It seems like :var is expected to be called multiple times! The docstring for org-babel--get-vars says that PARAMS "may contain multiple entries for the key :var ".

      (defun org-babel--get-vars (params)
        "Return the babel variable assignments in PARAMS.
      
      PARAMS is a quasi-alist of header args, which may contain
      multiple entries for the key `:var'.  This function returns a
      list of the cdr of all the `:var' entries."
        (mapcar #'cdr
                (cl-remove-if-not (lambda (x) (eq (car x) :var)) params)))
      

      If you're not careful, it looks like this should work:

      echo I have ${myarray["banana"]} bananas
      

      But, NO, actually it doesn't…apple is never assigned.

      echo $myarray
      echo I have ${myarray["banana"]} bananas and ${myarray["apple"]} apples
      

      What happens is this:

      Each :var is evaluated. The varname comes through as something that Bash should recognize, "myarray\[\"apple\"\]". However, the value is just "3", so the pcase sees it as an atom and uses org-babel--variable-assignments:sh-generic. This means bash doesn't make the "define" call necessary to create an array.

      Inneed, through this and through the use of pcase, the expectation appears to be that a lisp expression is passed to :var. So, :var can be called with any form.

      How else might we differentiate forms?

      The params coming into org-babel--get-vars look like:

      ((:results . "output replace")
      (:exports . "code")
      (:var . "myarray[\"apple\"]=3")
      (:var . "myarray[\"banana\"]=42")
      (:tangle . "no")
      (:hlines . "no")
      (:noweb . "no")
      (:cache . "no")
      (:session . "none"))
      

      Rather than check the values parameter, check the varname. That is check for array syntax. Should we be parsing for Bash code?

      This is the weeds. What does it look like stepping back?

      We're wanting to use the :var header to define variables in a Bash shell. In addition to regular variables, we want to define indexed arrays and associative arrays. How might an end user indicate that these need to be defined? Currently, they can give a lisp expression that gets translated. How else might an end user indicate and provide the necessary information for defining an indexed array or associative array?

      • Separate headers for indexed and associative arrays

      ACTUALLY… :var is intended to work with Org tables and lists (the following don't work here because of the indentation, but it does work).

      #+name:fruit
      | apple  |  3 |
      | banana | 42 |
      
      #+begin_src bash :results output :var myarray=fruit :eval never-export
      echo I have ${myarray["banana"]} bananas and ${myarray["apple"]} apples
      #+end_src
      
      #+RESULTS:
      : I have 42 bananas and 3 apples
      
  • [ ]

    Should this function exist?

    It's only delegating to other functions and is only called by org-babel-variable-assignments:shell.

    The reason why it's broken out looks like it may be because org-babel-variable-assignments:shell gets a list of params that is then mapped using a lambda. Breaking the delegation to a separate function makes the lambda code shorter.

    I'm curious if restructuring its logic would make it easier to understand and remove the need for these separate functions.

[0/3] Refactor

  • [ ]

    Change docstring

    It's not very clear. It doesn't follow Emacs Lisp Documentation Tips:

    For a function, the first line should briefly answer the question, "What does this function do?"

    The first line should mention all the important arguments of the function, and should mention them in the order that they are written in a function call.

    What parameters? What does represent mean? Why?

    It takes a variable name and…

    How should we describe "values"? It's not a list because it could be an atom.

    It takes a variable name and the corresponding value (or values). Variable name must be a string and a valid Bash variable name. Values must be a list of pairs, a list, or an atom.

  • [ ]

    Reword comments

    • Regarding "scalar":

    It's not for a "scalar", it's for an atom, since we're checking the form of values, not what values represents in Bash (and Bash doesn't have scalars). Further, the value of the variable is not necessarily a number:

    Y="six";
    eval echo I have "$X" apples
    

    And still yet, the underscore matches anything.

    • Regarding "two-dimensional array":

    The comment is misleading. It does not mean to say that "bashassoc" is a 2D-array, that associative arrays are 2D. They're not. What it means to say is that the form of values is a "list of lists". That values is a 2D list.

    It's confusing because bashassoc makes a Bash array. So, it's easy to confuse what this is talking about (especially if you're not familiar with Bash details).

    Further, Lisp lists are not arrays! So, it's technically incorrect.

    (let ((values '(("apple" 3) ("banana" 4) ("cherry" 5))))
      (if (arrayp values)
          (message "The list is an array")
        (message "Lists are not arrays")))
    
  • [ ]

    Create better worg documentation for :var

    First, the worg documentation is misleading. :var is intended for use mainly with named tables and lists. By happenstance (it appears), it also works with lisp expressions.

    This is how :var is expected to be used:

    apple 5
    banana 7
    echo I have ${myarray["banana"]} bananas and ${myarray["apple"]} apples
    

    And this works with arrays as expected.

    Berlin
    Washington DC
    echo ${myarray[0]} is the first capital by alphabetical order
    echo ${myarray[1]} comes last in alphabetical order
    

DONE defun org-babel–variable-assignments:fish

(defun org-babel--variable-assignments:fish
    (varname values &optional sep hline)
  "Return a list of statements declaring the values as a fish variable."
  (format "set %s %s" varname (org-babel-sh-var-to-sh values sep hline)))

This is basically the same function as org-babel--variable-assignments but for fish's syntax.

[0/0] Questions

[0/1] Refactor

  • [ ] Update docstring to explain the arguments

DONE defun org-babel-variable-assignments:shell

(defun org-babel-variable-assignments:shell (params)
  "Return list of shell statements assigning the block's variables."
  (let ((sep (cdr (assq :separator params)))
        (hline (when (string= "yes" (cdr (assq :hlines params)))
                 (or (cdr (assq :hline-string params))
                     "hline"))))
    (mapcar
     (lambda (pair)
       (if (string-suffix-p "bash" shell-file-name)
           (org-babel--variable-assignments:bash
            (car pair) (cdr pair) sep hline)
         (if (string-suffix-p "fish" shell-file-name)
             (org-babel--variable-assignments:fish
              (car pair) (cdr pair) sep hline)
           (org-babel--variable-assignments:sh-generic
            (car pair) (cdr pair) sep hline))))
     (org-babel--get-vars params))))

Call appropriate variable assignment function depending on shell type (i.e. suffix is bash or not). Each element of a :var header arg is processed.

[1/1] Questions

[0/1] Refactor

  • [ ] Change nested ifs to a single cond

DONE defun org-babel-prep-session:shell

(defun org-babel-prep-session:shell (session params)
  "Prepare SESSION according to the header arguments specified in PARAMS."
  (let* ((session (org-babel-sh-initiate-session session))
         (var-lines (org-babel-variable-assignments:shell params)))
    (org-babel-comint-in-buffer session
      (mapc (lambda (var)
              (insert var) (comint-send-input nil t)
              (org-babel-comint-wait-for-output session))
            var-lines))
    session))

Cruft left over from the original implementation of ob-sh. It was used to start and return the comint associated with the session.

Originally, the comint associated with a session was actually started by a function (that no longer exists) called org-babel-sh-initiate-session-by-key. The org-babel-sh-initiate-session returned only the buffer after it was created:

;; from commit 0fa68a53
(defvar org-babel-sh-buffers '(:default . nil))

(defun org-babel-sh-session-buffer (session)
  (cdr (assoc session org-babel-sh-buffers)))

(defun org-babel-sh-initiate-session-by-key (&optional session)
  "If there is not a current inferior-process-buffer in SESSION
then create.  Return the initialized session."
  (save-window-excursion
    (let* ((session (if session (intern session) :default))
           (sh-buffer (org-babel-sh-session-buffer session))
           (newp (not (org-babel-comint-buffer-livep sh-buffer))))
      (if (and sh-buffer (get-buffer sh-buffer) (not (buffer-live-p sh-buffer)))
          (setq sh-buffer nil))
      (shell sh-buffer)
      (when newp
        (setq sh-buffer (current-buffer))
        (org-babel-comint-wait-for-output sh-buffer))
      (setq org-babel-sh-buffers (cons (cons session sh-buffer)
                                       (assq-delete-all session org-babel-sh-buffers)))
      session)))

(defun org-babel-sh-initiate-session (&optional session)
  (unless (string= session "none")
    (org-babel-sh-session-buffer (org-babel-sh-initiate-session-by-key session))))

Over time, the functionality of creating the comint process was moved into org-babel-sh-initiate-session. Variable assignment was moved from org-babel-prep-session:sh to org-babel-sh-variable-assignments in 8e151c06 October 14, 2010. It has been untouched since then (for 13 years).

As you can see, org-babel-prep-session duplicates that of org-babel-sh-initiate-session-by-key (and the modern org-babel-sh-initiate-session). That is, it starts a comint.

[0/0] Questions

[0/1] Refactor

  • [ ]

    Remove this function

    See Refactor section of org-babel-load-session.

DONE defun org-babel-load-session:shell

(defun org-babel-load-session:shell (session body params)
  "Load BODY into SESSION."
  (save-window-excursion
    (let ((buffer (org-babel-prep-session:shell session params)))
      (with-current-buffer buffer
        (goto-char (process-mark (get-buffer-process (current-buffer))))
        (insert (org-babel-chomp body)))
      buffer)))

This is cruft from the original development. I highly doubt it's been used in the 13 years since it was last changed. Everything this function does has been reimplemented in other functions.

It starts a process if none exists and inserts the body into the associated process buffer. Initiate a comint, if needed. Define any variables that are given in the header args. Return buffer associated with the session.

It's intended for use with the (largely undocumented) org-metaup.

[1/1] Questions

  • [X]

    What is this used for?

    Searching for all instances of "org-babel-load-session:shell", we see that it's used in ob-core.

    rg -n --type=elisp -e org-babel-load-session\: /tmp/org-mode | rg -v defun
    

    This is within the org-babel-load-in-session function.

    rg -n --type=elisp -e org-babel-load-in-session /tmp/org-mode | rg -v defun
    

    This is used by org-babel-load-in-session-maybe which is added as a hook to the org-metaup-hook.

    rg -n --type=elisp -e org-metaup-hook /tmp/org-mode
    

    It looks like org-metaup-hook is only set up (by default) with org-babel-load-in-session-maybe (which calls org-babel-load-in-session).

    The only other place this stuff is used in code is in org-keys (seen above in the rg output). The org-keys.el commentary says,

    ;; This library adds bindings for Org mode buffers. It also ;; implements both Speed keys and Babel speed keys. See manual for ;; details.

    Looking at the manual, indeed, org-babel-load-in-session is mentioned in Key bindings and Useful Functions:

       Active key bindings in code blocks:
    
    Key binding    Function
    --------------------------------------------------------
    'C-c C-c'      'org-babel-execute-src-block'
    'C-c C-o'      'org-babel-open-src-block-result'
    'M-<UP>'       'org-babel-load-in-session'
    'M-<DOWN>'     'org-babel-pop-to-session'
    
    ...
    
    'C-c C-v l' or 'C-c C-v C-l'     'org-babel-load-in-session'
    

    No mention of org-babel-load-in-session is given related to the speed commands. Speed commands creates a shortcut for org-metaup, but does not explain (or in any way connect) that org-babel-load-in-session exists.

    Back to the original question, what is this used for?

    org-babel-load-session:shell is connected to an undocumented feature. It's unclear what this feature is. Try calling M-x org-metaup on the following block:

    echo "hi"
    echo "world"
    

    It starts a session and takes you to it:

    ahab@pequod ~/Projects/excalamus.com/src/draft/emacs$ PROMPT_COMMAND=;PS1="org_babel_sh_prompt> ";PS2=
    org_babel_sh_prompt> echo "hi"
    echo "world"
    

[0/1] Refactor

  • [ ]

    Remove all traces of org-babel-load-session

    I would be shocked if this feature has been used by anyone in the past 13 years. It requires a largely undocumented feature: org-metaup.

    The function of it is to take you to the comint buffer. Something like that might be useful if Org could capture the input and output. However, it doesn't. This defeats the whole point of literate programming.

    Unless there's a way to capture the input and output of session interaction happening outside the Org block, I don't see a point in keeping this.

    We're also in the fortunate position of no one likely knowing about this, making it easy for us to kill it.

DONE defun org-babel-sh-strip-weird-long-prompt

(defun org-babel-sh-strip-weird-long-prompt (string)
  "Remove prompt cruft from a string of shell output."
  (while (string-match "^% +[\r\n$]+ *" string)
    (setq string (substring string (match-end 0))))
  string)

Used only in org-babel-sh-evaluate and only for sessions. Removes leading white space for strings starting with "%".

The name concerns me. It sounds an awful lot like the problem it tries to solve wasn't fully understood. That is, it's a work around, not a solution.

Introduced in commit 9623b169.

[3/3] Questions

  • [X]

    What is the actual regexp?

    The regexp is:

    "^% +[\r\n$]+ *"
    

    This says,

    • ^ starting at the beginning of the line
    • % match the percent sign
    • + match one or more spaces
    • [\r\n$]+ match one or more line endings
    • * match zero or more white spaces
  • [X]

    Does this regexp actually do anything?

    I don't think so.

    Here's how it works:

    (let ((string "% \n\n\r\n\rhello world\ngood cruel world x_x"))
      (while (string-match "^% +[\r\n$]+ *" string)
        (setq string (substring string (match-end 0))))
      string)
    

    It will not match unless the string starts with a "%":

    (let ((string "\n\n\r\n\rhello world\ngood cruel world x_x"))
      (while (string-match "^% +[\r\n$]+ *" string)
        (setq string (substring string (match-end 0))))
      string)
    

    The only place org-babel-sh-strip-weird-long-prompt is ever called is in org-babel-sh-evaluate. The captured session text is passed in.

    For the following, #("The value of x is 7" 0 19 (field output)) is the string passed in:

    x=7
    echo "The value of x is $x"
    

    I assume the "%" corresponded to the prompt, but I can't tell. Currently, the string passed in is the result of the (several times) trimmed result of evaluation.

    ;; org-babel-sh-evaluate
      (mapconcat
       #'org-babel-sh-strip-weird-long-prompt
       (mapcar
        #'org-trim
        (butlast ; Remove eoe indicator
         (org-babel-comint-with-output
             (session org-babel-sh-eoe-output t body)
           (insert (org-trim body) "\n"
                   org-babel-sh-eoe-indicator)
           (comint-send-input nil t))
         ;; Remove `org-babel-sh-eoe-indicator' output line.
         1))
       "\n")))
    
  • [X]

    Is it necessary to setq?

    The while loop continues through the string in order to find all matches. When a match is found, the string is replaced. In this way, each iteration of the loop is checking a new string. This allows for (match-end 0) to always be used for the next match.

    Yes, the setq is necessary if the goal is to return a new string.

[0/1] Refactor

  • [ ]

    Remove this function

    I'm leaning toward removing this. It doesn't seem like it's called anymore. I have no clue how to reproduce the problem it seems to solve and there's no record I could find of what the problem even was. It also appears like the original problem was never fully understood.

    If we remove it and a bug shows up, then we'll have a real test case, have a chance to figure out what's actually happening, and maybe find a better solution. We could remove the element of "weirdness." It would negatively affect users in the short term by outputting extra characters. The output would still exist, however, and users would be able to keep using the Babel. In the long term, we'd be able to better support users by actually understanding our code base.

    By removing it, we avoid:

    1. Having code
    2. Not understanding our code
    3. Doing extra processing

DONE defun ob-shell-async-chunk-callback

(defun ob-shell-async-chunk-callback (string)
  "Filter applied to results before insertion.
See `org-babel-comint-async-chunk-callback'."
  (replace-regexp-in-string comint-prompt-regexp "" string))

As the docstring says, the filter applied to results before insertion. Replace the comint-prompt-regexp ("orgbabelshprompt> *") with nothing.

Used in org-babel-sh-evaluate. Provided as the CHUNK-CALLBACK argument of org-babel-comint-async-register. Called after the ob-shell-async-indicator is found and the results scraped from the comint buffer.

[0/0] Questions

[0/1] Refactor

  • [ ] Document argument

DONE [10/10] defun org-babel-sh-evaluate

This is where all the functionality of ob-shell lives. Understand this and you'll have the bulk of what ob-shell is.

(defun org-babel-sh-evaluate (session body &optional params stdin cmdline)
  "Pass BODY to the Shell process in BUFFER.
If RESULT-TYPE equals `output' then return a list of the outputs
of the statements in BODY, if RESULT-TYPE equals `value' then
return the value of the last statement in BODY."
  (let* ((shebang (cdr (assq :shebang params)))
         (async (org-babel-comint-use-async params))
         (results-params (cdr (assq :result-params params)))
         (value-is-exit-status
          (or (and
               (equal '("replace") results-params)
               (not org-babel-shell-results-defaults-to-output))
              (member "value" results-params)))
         (results
          (cond
           ((or stdin cmdline)         ; external shell script w/STDIN
            (let ((script-file (org-babel-temp-file "sh-script-"))
                  (stdin-file (org-babel-temp-file "sh-stdin-"))
                  (padline (not (string= "no" (cdr (assq :padline params))))))
              (with-temp-file script-file
                (when shebang (insert shebang "\n"))
                (when padline (insert "\n"))
                (insert body))
              (set-file-modes script-file #o755)
              (with-temp-file stdin-file (insert (or stdin "")))
              (with-temp-buffer
                (with-connection-local-variables
                 (apply #'process-file
                        (if shebang (file-local-name script-file)
                          shell-file-name)
                        stdin-file
                        (current-buffer)
                        nil
                        (if shebang (when cmdline (list cmdline))
                          (list shell-command-switch
                                (concat (file-local-name script-file)  " " cmdline)))))
                (buffer-string))))
           (session                     ; session evaluation
            (if async
                (progn
                  (let ((uuid (org-id-uuid)))
                    (org-babel-comint-async-register
                     session
                     (current-buffer)
                     "ob_comint_async_shell_\\(.+\\)_\\(.+\\)"
                     'ob-shell-async-chunk-callback
                     nil)
                    (org-babel-comint-async-delete-dangling-and-eval
                        session
                      (insert (format ob-shell-async-indicator "start" uuid))
                      (comint-send-input nil t)
                      (insert (org-trim body))
                      (comint-send-input nil t)
                      (insert (format ob-shell-async-indicator "end" uuid))
                      (comint-send-input nil t))
                    uuid))
              (mapconcat
               #'org-babel-sh-strip-weird-long-prompt
               (mapcar
                #'org-trim
                (butlast ; Remove eoe indicator
                 (org-babel-comint-with-output
                     (session org-babel-sh-eoe-output t body)
                   (insert (org-trim body) "\n"
                           org-babel-sh-eoe-indicator)
                   (comint-send-input nil t))
                 ;; Remove `org-babel-sh-eoe-indicator' output line.
                 1))
               "\n")))
           ;; External shell script, with or without a predefined
           ;; shebang.
           ((org-string-nw-p shebang)
            (let ((script-file (org-babel-temp-file "sh-script-"))
                  (padline (not (equal "no" (cdr (assq :padline params))))))
              (with-temp-file script-file
                (insert shebang "\n")
                (when padline (insert "\n"))
                (insert body))
              (set-file-modes script-file #o755)
              (if (file-remote-p script-file)
                  ;; Run remote script using its local path as COMMAND.
                  ;; The remote execution is ensured by setting
                  ;; correct `default-directory'.
                  (let ((default-directory (file-name-directory script-file)))
                    (org-babel-eval (file-local-name script-file) ""))
                (org-babel-eval script-file ""))))
           (t (org-babel-eval shell-file-name (org-trim body))))))
    (when (and results value-is-exit-status)
      (setq results (car (reverse (split-string results "\n" t)))))
    (when results
      (let ((result-params (cdr (assq :result-params params))))
        (org-babel-result-cond result-params
          results
          (let ((tmp-file (org-babel-temp-file "sh-")))
            (with-temp-file tmp-file (insert results))
            (org-babel-import-elisp-from-file tmp-file)))))))

The org-babel-sh-evaluate function has two parts, calculating the results and formatting the results. Calculating the results makes up most of the function. It's a conditional for each case where each case corresponds to possible header arguments. Formatting is more-or-less off-loaded to core Org functionality.

There are several ways in which a shell block may evaluate. Many of these overlap and influence each other. Each, more-or-less, corresponds to a case in the conditional.

These are the current cases:

  1. stdin
  2. cmdline
  3. shebang
  4. session
  5. async
  6. general
  • [X]

    defun and org-babel-sh-evaluate-varlist-shebang

    (defun org-babel-sh-evaluate (session body &optional params stdin cmdline)
      "Pass BODY to the Shell process in BUFFER.
    If RESULT-TYPE equals `output' then return a list of the outputs
    of the statements in BODY, if RESULT-TYPE equals `value' then
    return the value of the last statement in BODY."
      (let* ((shebang (cdr (assq :shebang params)))
    

    The arguments are several parameters (session, stdin, cmdline, async, etc) and the source block body. For some reason, the parameters are split between being explicitly passed and passed within the params list. See Questions and Refactor for discussion.

    Params is an alist of data. condition See "What is params?" in defun org-babel-execute:shell for more information.

    A shebang is a line like "#!/bin/bash" which does two things:

    1. Allows a script file to execute as a command
    2. Specifies which binary to use when executing

    This let statement binds the :shebang header argument.

  • [X]

    org-babel-sh-evaluate-varlist-async

    (async (org-babel-comint-use-async params))
    

    Determine whether to use async.

    Async depends on 1) :session not being "none" and 2) the :async header not being "no". See Questions section below on toggle (in)consistencies. org-babel-use-async moves this logic to a separate module for other async implementations to use.

  • [X]

    org-babel-sh-evaluate-varlist-results-params

    (results-params (cdr (assq :result-params params)))
    

    The :result-params key in params holds the header values for the :results header.

    It looks something like,

    (:result-params "replace" "output")
    

    or

    (:result-params "replace" "value" "table")
    

    Results are presentable in several formats and :result-params holds the keywords for deciding what the output should look like.

  • [X]

    org-babel-sh-evaluate-varlist-value-is-exit-status

    (value-is-exit-status
     (or (and
          (equal '("replace") results-params)
          (not org-babel-shell-results-defaults-to-output))
         (member "value" results-params)))
    

    Block results are stdout, stderr, or an exit code. Typically, an exit code of 0 means success, non-0 means failure. value-is-exit-status helps figure out which result to return.

    This is something that needs refactoring. See Refactoring notes.

  • [X]

    org-babel-sh-evaluate-varlist-results

    (results
     (cond
    

    Results are calculated using a cond. Various combinations of headers make up the different cases. See Question asking whether all combinations are represented.

  • [X]

    org-babel-sh-evaluate-varlist-results-stdin-cmdline

    ((or stdin cmdline)            ; external shell script w/STDIN
     (let ((script-file (org-babel-temp-file "sh-script-"))
           (stdin-file (org-babel-temp-file "sh-stdin-"))
           (padline (not (string= "no" (cdr (assq :padline params))))))
       (with-temp-file script-file
         (when shebang (insert shebang "\n"))
         (when padline (insert "\n"))
         (insert body))
       (set-file-modes script-file #o755)
       (with-temp-file stdin-file (insert (or stdin "")))
       (with-temp-buffer
         (with-connection-local-variables
          (apply #'process-file
                 (if shebang (file-local-name script-file)
                   shell-file-name)
                 stdin-file
                 (current-buffer)
                 nil
                 (if shebang (when cmdline (list cmdline))
                   (list shell-command-switch
                         (concat (file-local-name script-file)  " " cmdline)))))
         (buffer-string))))
    

    This condition handles the stdin, cmdline, and (sometimes) shebangs headers. Note: there's a separate branch condition for shebang that's only called when there's no stdin or cmdline header. So, this is for:

    stdin cmdline stdin, shebang cmdline, shebang stdin, cmdline, shebang

    A shebang is a line at the top of a file like "#!/bin/bash" which allows a shell script to run as a command. The name "shebang" comes from combining the names of the first two characters, the "hash" (#) and the "bang" (!).

    It works by calling process-file which has form:

    (process-file PROGRAM &optional INFILE BUFFER DISPLAY &rest ARGS)
    

    The stdin header is put into a temporary file whose path is passed in as INFILE.

    The block code is also put into a temporary file. The block code is either executed as a shebang or called as part of the argument to the shell binary. The shebang line is inserted when a :shebang is given.

    The temporary script file is made executable with 755 permissions. This corresponds to chmod a+rwx,o-w which sets permissions so that the group and owner can read, can write and can execute. Others can read and execute, but can't write.

    When the block script is executed, it's run using process-file in a temporary buffer. The results of the process are placed in the buffer and, finally, the contents of the buffer given for the results.

    For some reason, a newline is inserted if :padline is given. See Questions.

  • [X]

    org-babel-sh-evaluate-varlist-results-session

    (session                        ; session evaluation
     (if async
         (progn
           (let ((uuid (org-id-uuid)))
             (org-babel-comint-async-register
              session
              (current-buffer)
              "ob_comint_async_shell_\\(.+\\)_\\(.+\\)"
              'ob-shell-async-chunk-callback
              nil)
             (org-babel-comint-async-delete-dangling-and-eval
                 session
               (insert (format ob-shell-async-indicator "start" uuid))
               (comint-send-input nil t)
               (insert (org-trim body))
               (comint-send-input nil t)
               (insert (format ob-shell-async-indicator "end" uuid))
               (comint-send-input nil t))
             uuid))
       (mapconcat
        #'org-babel-sh-strip-weird-long-prompt
        (mapcar
         #'org-trim
         (butlast ; Remove eoe indicator
          (org-babel-comint-with-output
              (session org-babel-sh-eoe-output t body)
            (insert (org-trim body) "\n"
                    org-babel-sh-eoe-indicator)
            (comint-send-input nil t))
          ;; Remove `org-babel-sh-eoe-indicator' output line.
          1))
        "\n")))
    

    The "session" condition has two parts: async and non-async.

    Async handling comes first. The async API requires "registering" certain information: the session buffer, the org buffer the block lives in, a regexp for matching the beginning and end of evaluation, and a user-defined filter function. The registration process adds a special (non-user-defined) filter function to the session buffer (org-babel-comint-async-filter). This filter looks at output sent to the session buffer from the process for the end of evaluation regex.

    After the registration is made, the "start" indicator is sent, the block code sent, and finally the "end" indicator. Because this is all sent to a shell session, it runs in the background until the filter finds the "end" indicator. Until that happens, the result of the async branch is the uuid and that's what appears for the user as the result. (If the "end" indicator is found, the uuid is located and replaced with the text occurring between the "start" and "end" indicators. But that's all handled by things in ob-comint.el.)

    Non-async sessions are handled by the next bit of code. Basically, it sends the block code to the session and waits for the org-babel-sh-eoe-indicator. How this happens is determined by code in ob-comint.el.

  • [X]

    org-babel-sh-evaluate-varlist-results-shebang

    ;; External shell script, with or without a predefined
    ;; shebang.
    ((org-string-nw-p shebang)
     (let ((script-file (org-babel-temp-file "sh-script-"))
           (padline (not (equal "no" (cdr (assq :padline params))))))
       (with-temp-file script-file
         (insert shebang "\n")
         (when padline (insert "\n"))
         (insert body))
       (set-file-modes script-file #o755)
       (if (file-remote-p script-file)
           ;; Run remote script using its local path as COMMAND.
           ;; The remote execution is ensured by setting
           ;; correct `default-directory'.
           (let ((default-directory (file-name-directory script-file)))
             (org-babel-eval (file-local-name script-file) ""))
         (org-babel-eval script-file ""))))
    

    Check if shebang is non-empty; check for padline Write shebang and body to temp file

    The temporary script file is made executable with 755 permissions. This corresponds to chmod a+rwx,o-w which sets permissions so that the group and owner can read, can write and can execute. Others can read and execute, but can't write.

    Check if the script file is remote. This is because, ultimately, process-file is called. Two things must happen:

    1. strip the remote information from the filename
    2. make sure the file runs on the remote

    process-file says to use file-local-name for remote files. This strips the remote information from the filename. However, without this information, process-file will execute the file on the local machine. We change this behavior by setting the default-directory to make the file relative to the remote machine.

    For example, when processing a remote file, the script-file variable looks like:

    "/ssh:ahab@localhost:/tmp/sh-script-3qPNNc"

    The file-local-name strips the "ssh" stuff off the front to make it: /tmp/sh-script-3qPNNc. Originally, the file was relative to "/ssh:ahab@localhost:/tmp/". After file-local-name, it looks like it's relative to "/tmp" (on the local machine).

    org-babel-eval then tries to run the result of file-local-name. Without setting the default-directory, it uses the current default-directory. This is the the local default-directory which is "/tmp". When this happens, the script can't be found. We must set the default-directory to the remote (/ssh:ahab@localhost:/tmp/). Then it's found.

  • [X]

    org-babel-sh-evaluate-varlist-results-general

    (t (org-babel-eval shell-file-name (org-trim body))))))
    

    This handles any of the cases not covered.

    It trims the body and runs it through process-file. See "How Shell works at a deep level" in the Mailing list items.

  • [X]

    org-babel-sh-evaluate-varlist-return

    (when (and results value-is-exit-status)
      (setq results (car (reverse (split-string results "\n" t)))))
    (when results
      (let ((result-params (cdr (assq :result-params params))))
        (org-babel-result-cond result-params
          results
          (let ((tmp-file (org-babel-temp-file "sh-")))
            (with-temp-file tmp-file (insert results))
            (org-babel-import-elisp-from-file tmp-file)))))))
    

    Shells can return two types of values, exit codes and standard output.

    The Babel results are the exit code when using ":results value":

    echo "hi"
    

    The results look like:

    "hi
    0
    "
    
    (car (reverse (split-string "hi\n0" "\n" t)))
    

    The final branch takes the result-params (which control output formatting) and

    echo "hello"
    echo "world"
    

    org-table-result-cond handles the different result cases. For a table, it needs a sexp. It gets this from org-babel-import-elisp-from-file:

    (("hello") ("world"))
    

    This is what's returned to the caller. Why? That's not exactly clear. But there you have it.

[3/17] Questions

  • [ ] Why does this function exist?
  • [ ]

    What combinations exist and are they represented?

    Possible headers:

    • stdin
    • cmdline
    • stdin
    • session
    • async
    • shebang
    • default
    ;; org-sort
    ;; delete-duplicate-lines
    (defun powerset (x)
    (if (null x)
        (list '())
        (let ((elt (car x))
              (p (power-set (cdr x))))
          (append (mapc (lambda (s) (cons elt s)) p) p))))
    
    (powerset
     '(stdin
       cmdline
       stdin
       session
       async
       shebang
       ))
    
  • [ ]

    What is the use of running a block as a shebang?

    The sole purpose (AFAIU) of a shebang is to run a script as a command. The purpose of Babel is for literate programming. Does literate programming have any overlap with running a script as a command?

    A command is a way to execute code. In Babel, executing a block is the way to execute the code. A shebang, AFAIU, doesn't have any present inherent change in execution. The shebang header does affect output within Babel because of implementation details. However, that's not a part of shebangs themselves.

    What are the different was to execute a script?

    1. run as a command (shebang)
    2. passing the script as an argument (sh my-script.sh)
    3. passing it as a command string (bash -c "code stuff")
    4. sourcing it

    What are the differences between them?

    AFAIU, (1) and (2) are the same. (3) doesn't work well with multiple lines. (4) modifies the current environment.

    What am I missing? Because (1) and (2) are the same, it seems to me that the shebang option could be deprecated and removed.

  • [X]

    Is a block actually run as a shebang/command?

    Yes. The call made is:

    (apply #'process-file
         (if shebang (file-local-name script-file)
           shell-file-name)
         ...
    

    If the :shebang header is specified, then the temporary script file name is passed in as the PROGRAM argument for process-file.

  • [ ]

    What is required to make the toggles for options consistent?

    Async depends on 1) :sessions not being "none" and 2) the :async header not being "no". The following is run async:

    echo "hello"
    sleep 3
    echo "world"
    

    I think ob-python uses "yes". What are the toggles for other languages?

  • [ ]

    How does the logic for value-is-exit-status work?

    There was big discussion in 2020 about the meaning of output and value and about which should be default (if any).

    See https://orgmode.org/list/CA+A2iZaziAfMeGpBqL6qGrzrWEVvLvC0DUw++T4gCF3NGuW-DQ@mail.gmail.com

    At that time, in commit 07c6b1125, the current logic was put in place:

    -        (value-is-exit-status (or (equal '("replace") results-params)
    -                                  (member "value" results-params)))
    +        (value-is-exit-status
    +         (or (and
    +              (equal '("replace") results-params)
    +              (not org-babel-shell-results-defaults-to-output))
    +             (member "value" results-params)))
    

    It says,

    If "replace" and not default to output -> t
    Elseif value appears -> t
    Else nil
    
  • [ ]

    Why only check "default to output" for "replace" in value-is-exit-status?

    What is org-babel-shell-results-defaults-to-output?

    org-babel-shell-results-defaults-to-output is a variable defined in 'ob-shell.el'.

    Let shell execution defaults to ":results output".

    When set to t, use ":results output" when no :results setting is set. This is especially useful for inline source blocks.

    When set to nil, stick to the convention of using :results value as the default setting when no :results is set, the "value" of a shell execution being its exit code.

    What is "replace?"

    'replace' Default. Insert results in the Org buffer. Remove previous results. Usage example: ':results output replace'.

    We want "value" to correspond to "exit code" and "output" to correspond to "stdout".

    The default Org Babel behavior is "value". However, "value" is not a good default for ob-shell. So, the setting org-babel-shell-results-defaults-to-output is set to t, making "output" the default.

    So, when org-babel-shell-results-defaults-to-output is nil, results should be "value."

    Why is "replace" checked at all? It's not clear to me, but the results aren't right if it's removed. The logic for this needs to be cleaned up…

  • [ ]

    How does with-connection-local-variables work?

    Within the stdin/cmdline branch, what does with-connection-local-variables do?

    It applies "connection-local-variables", yes. I suppose those are set elsewhere?

  • [ ]

    Should other branches use with-connection-local-variables?

    Might this help with some of the remote file handling? Or is this just a convenience? Should other branches get the same convenience?

  • [ ]

    When stdin/cmdline branch called without shebang, does shell-command-switch always work?

    It looks like shell-command-switch is always set to "-c". This is bash specific.

    shell-command-switch is only called in this branch. So, it's being used for all shells. This should be set in the initialize closure for each shell (if such a flag exists).

  • [X]

    Is a check done on the :shebang?

    No. It appears to just run and fail silently.

    On Android, at least, "Code block produced no output."

    #+begin_src sh :shebang #!/banana :cmdline first second
    echo "$2"
    #+end_src
    
    #+RESULTS:
    

    Note stdin or cmdline must be given in order for the stdin/cmdline branch to apply.

  • [ ]

    Are the file permissions appropriate?

    For Stdin/Cmdline, they are made executable with 755 permissions. This corresponds to chmod a+rwx,o-w which sets permissions so that the group and owner can read, can write and can execute. Others can read and execute, but can't write.

  • [ ]

    Why is :padline checked in stdin/cmdline?

    The manual says,

    The 'padline' header argument controls insertion of newlines to pad source code in the tangled file.

    So, why is there any handling for padlines here?

    The padline is only used after inserting the shebang. Is a newline sometimes (but not always) needed after a shebang?

  • [ ]

    Why does the async user define the uuid and call the start/end indicators?

    This is really a question for ob-comint.el.

  • [ ]

    Why does the shebang branch check if the script file is remote?

    Is it for when the Org buffer/file is being run on a remote machine?

    Are there other places this should be done?

  • [ ]

    For shebang branch, could the remote file checking logic clean up?

    Maybe.

    This logic was introduced in 39de4a18:

    commit 39de4a1848d12b1be929853bf884ec04e121d9f0 (HEAD) Author: Ihor Radchenko <yantar92@posteo.net> Date: Sun Jul 23 17:38:09 2023 +0300

    org-babel-sh-evaluate: Fix edge case when :shebang is set and :dir is remote

    • lisp/ob-shell.el (org-babel-sh-evaluate): Pass remote local file

    name as command when executing script with :shebang. `org-babel-eval' will fail when SCRIPT-FILE is TRAMP file name.

    Link: https://www.reddit.com/r/orgmode/comments/14zh2yi/orgbabel_shebang_not_working_on_with_tramp_dir/

    The following is a function I made that probably isn't necessary.

    (defun my-file-name-directory (filename)
      "Like `file-name-directory' but works with remote files."
      (interactive)
      (let* ((remote-connection (file-remote-p filename))
             (end (if remote-connection
                      (match-end
                       (string-match remote-connection filename)))))
        (file-name-directory (substring filename end))))
    
    (my-file-name-directory
     ;; "/ssh:ahab@localhost:~/"
     "/tmp/org-mode/lisp/ox.el"
     ;; "~/delete-me"
     )
    
  • [X]

    Is org-babel-import-elisp-from-file necessary for results?

    No, there are other ways. I was not able to find anything as simple as "convert-results-to-table(txt)". Everything seems to require gymnastics.

    org-babel-import-elisp-from-file requires writing to disk simply to convert a the results into a sexp.

    For the following,

    echo "hello"
    echo "world"
    

    the "results" look like,

    "hello
    world
    "
    

    It's not clear to me if there's a function to take the results string and convert it to a table. There are several functions that kind of do something like that,

    • org-table-convert-region
    • org-table-create-or-convert-from-region
    • the whole org-babel-import-elisp-from-file dance ob-shell does

    The following works:

    (with-temp-buffer
      (insert "hello\nworld\n")
      (org-table-convert-region (point-min) (point-max))
      (buffer-substring-no-properties (point-min) (point-max)))
    

    Notice that it does the transformation in place and therefore requires manually grabbing the result and returning it.

    Maybe this should be a function?

    Fun fact, there are two "create-table-or-convert-from-region" functions in org-table.el:

    (defun org-table-create-or-convert-from-region (arg)
      "Convert region to table, or create an empty table.
    If there is an active region, convert it to a table, using the function
    `org-table-convert-region'.  See the documentation of that function
    to learn how the prefix argument is interpreted to determine the field
    separator.
    If there is no such region, create an empty table with `org-table-create'."
      (interactive "P")
      (if (org-region-active-p)
          (org-table-convert-region (region-beginning) (region-end) arg)
        (org-table-create arg)))
    
    (defun orgtbl-create-or-convert-from-region (_arg)
      "Create table or convert region to table, if no conflicting binding.
    This installs the table binding `C-c |', but only if there is no
    conflicting binding to this key outside `orgtbl-mode'."
      (interactive "P")
      (let* (orgtbl-mode (cmd (key-binding "\C-c|")))
        (if cmd
          (call-interactively cmd)
          (call-interactively 'org-table-create-or-convert-from-region))))
    

    It seems like this might be because there are different table types?

    ;;; Commentary:

    ;; Watch out: Here we are talking about two different kind of tables. ;; Most of the code is for the tables created with the Org mode table editor. ;; Sometimes, we talk about tables created and edited with the table.el ;; Emacs package. We call the former org-type tables, and the latter ;; table.el-type tables.

    It feels like this is typical for the Org code base. Functions take weird arguments and work in ways that seem based simply on the functions that were available, regardless of whether it made any sense. They're not designed to be composable. So, you end up needing to do a bunch of weird transformations and understand the mechanics of how the underlying system works. Basically, there's no abstraction.

    My impression is that this makes maintenance harder. For example, I want to focus on shell functionality. My brain is filled with the Chapter 18 level of kernel driver books dealing with terminal I/O and here I am having to also deal with table rendering when there's no fundamental reason why it can't be a function that takes a string. That makes me grumpy.

    This makes it harder to get maintainers because maintaining one section is not possible. You need a deep understanding of each part of the code base.

[0/7] Refactor

  • [ ]

    Get rid of this function

    It makes no sense. The necessary variables (session, cmdline, async, etc.) are randomly grabbed within this function and it's caller. The caller, org-babel-execute:shell is the one needed by the Org Babel API. Both functions do similar things, like prep for the type of execution and preparing the results. There is no clear reason for either to exist. It's not like one does the subprocess call and the other assembles results. No! Both do a bit of that.

  • [ ] Document the arguments
  • [ ]

    Make the arguments consistent

    For some reason the arguments are inconsistent. Some are provided directly (session, stdin, and cmdline), others through the params. Both of the direct ones come from params, yet are obtained in the caller function, org-babel-execute:shell, and are only used to call org-babel-sh-evaluate.

  • [ ]

    Fix value-is-exit-status

    This is part of the "Get rid of this function" argument.

    Basically,

    1. Don't split the functionality of getting the exit code between two functions
    2. Don't use only "echo $?" to get the exit status (what about "unsupported" shells?)

    value-is-exit-status is a toggle to return the exit code. It works like this:

    A block is executed and org-babel-execute:shell is called. value-is-exit-status is determined. When it's true, "\necho $?" is appended to the block body. Then org-babel-sh-evaluate is executed. value-is-exit-status is again figured out. When it's true, then the result is parsed for the last value (which is the exit code).

    Maybe I need to think more, but I don't see why this needs to be split between two functions.

    Also, the exit code is obtained using "\necho $?". I'm pretty sure that's only valid for Bourne-like shells (I have not checked).

  • [ ]

    Fix docstring for org-babel-shell-results-defaults-to-output

    Says "Let shell execution defaults to ":results output"."

    Should be something like

    "Let shell execution default to ":results output".

    or

    "Set default execution to ":results output""

  • [ ] Make the paths for stdin, cmdline, and shebang more clear
  • [ ]

    Move functionality of org-babel-sh-evaluate-varlist-return elsewhere

    The evaluation should happen separately from the formatting of the results of the evaluation. Right now, the logic for formatting is split between two functions (and that logic appears unnecessarily complex).

DONE defun org-babel-execute:shell

(defun org-babel-execute:shell (body params)
  "Execute Shell BODY according to PARAMS.
This function is called by `org-babel-execute-src-block'."
  (let* ((session (org-babel-sh-initiate-session
                   (cdr (assq :session params))))
         (stdin (let ((stdin (cdr (assq :stdin params))))
                  (when stdin (org-babel-sh-var-to-string
                               (org-babel-ref-resolve stdin)))))
         (results-params (cdr (assq :result-params params)))
         (value-is-exit-status
          (or (and
               (equal '("replace") results-params)
               (not org-babel-shell-results-defaults-to-output))
              (member "value" results-params)))
         (cmdline (cdr (assq :cmdline params)))
         (full-body (concat
                     (org-babel-expand-body:generic
                      body params (org-babel-variable-assignments:shell params))
                     (when value-is-exit-status "\necho $?"))))
    (org-babel-reassemble-table
     (org-babel-sh-evaluate session full-body params stdin cmdline)
     (org-babel-pick-name
      (cdr (assq :colname-names params)) (cdr (assq :colnames params)))
     (org-babel-pick-name
      (cdr (assq :rowname-names params)) (cdr (assq :rownames params))))))

The fundamental entry point for ob-shell when a source block is evaluated.

When a user executes a block, the ob-core function org-babel-execute-src-block is called. It dynamically binds "org-babel-execute:lang" to a variable "cmd" where "lang" corresponds to whatever follows "#+beginsrc" in the block header ("lang" corresponds to what we've been calling "template").

;; org-babel-execute-src-block

...
           d))))
     (cmd (intern (concat "org-babel-execute:" lang)))  ; dynamically bind symbol
     result exec-start-time)
...
(setq exec-start-time (current-time)
      result
      (let ((r (save-current-buffer (funcall cmd body params))))  ; call org-babel-execute:lang
        (if (and (eq (cdr (assq :result-type params)) 'value)
...

Recall that org-babel-shell-initialize generated functions called "org-babel-execute:template", one for each of the shells given in org-babel-shell-names. Each of these calls org-babel-execute:shell.

BODY is the inside of the source block. PARAMS is an alist (see below) of header arguments and other information.

This is function doesn't have a clear purpose. It just happens to be the entry point.

It parses the block header and passes some of these to is the eventual execution function org-babel-sh-evaluate. However, it also passes the full params which is re-parsed out again.

It sort of handles assembling results. For example, it reassembles the table. However, some of that is also done in org-babel-sh-evaluate.

[1/3] Questions

  • [ ] The org-babel-execute:template functions and org-babel-execute:shell are all related to dispatch. How might we dispatch differently and are any of the alternatives better?
  • [X]

    Why does the docstring say "This function is called by `org-babel-execute-src-block'."?

    It seems to me that it says this because it's trying to explain how the Org Babel API functions. Unfortunately, how the Org Babel API functions is rather opaque.

  • [ ]

    What is params?

    Params is a data structure whose specific structure depends on the context. PARAMS, as seen within org-babel-execute:shell, stores information taken from the source block header as an alist:

    ;; What's stored in the 'params' parameter:
    ((:colname-names)
     (:rowname-names)
     (:result-params "replace")
     (:result-type . value)
     (:results . "replace")
     (:exports . "code")
     (:session . "none")
     (:cache . "no")
     (:noweb . "no")
     (:hlines . "no")
     (:tangle . "no"))
    
    ;; Typical access pattern:
    (cdr (assq :results
               '((:colname-names)
                 (:rowname-names)
                 (:result-params "replace")
                 (:result-type . value)
                 (:results . "replace")
                 (:exports . "code")
                 (:session . "none")
                 (:cache . "no")
                 (:noweb . "no")
                 (:hlines . "no")
                 (:tangle . "no"))
               ))  ; "replace"
    

    Note that PARAMS is defined differently elsewhere! Some of these docstring descriptions, such as for org-babel-variable-assignments, are probably incorrect in calling it a plist.

    According to org-babel--get-vars,

    PARAMS is a quasi-alist of header args, which may contain multiple entries for the key `:var'. This function returns a list of the cdr of all the `:var' entries."

    According to org-babel--file-desc,

    PARAMS is header argument values.

    According to org-list--to-generic-plain-list,

    PARAMS is a plist used to tweak the behavior of the transcoder.

    According to org-babel-tangle--unbracketed-link,

    The PARAMS are the 3rd element of the info for the same src block.

    According to org-babel-comint-use-async,

    PARAMS are the header arguments as passed to `org-babel-execute:lang'.

    According to org-babel-variable-assignments:plantuml,

    PARAMS is a property list of source block parameters, which may contain multiple entries for the key `:var'. `:var' entries in PARAMS are expected to be scalar variables."

[0/2] Refactor

  • [ ] Remove the statement "This function is called by `org-babel-execute-src-block'." from the docstring. Although a true statement, it's also true that this function is called by all of the org-babel-execute:template functions.
  • [ ] Give this function a clear division of responsiblity

DONE defun org-babel-shell-initialize

(defun org-babel-shell-initialize ()
  "Define execution functions associated to shell names.
This function has to be called whenever `org-babel-shell-names'
is modified outside the Customize interface."
  (interactive)
  (dolist (name org-babel-shell-names)
    (let ((fname (intern (concat "org-babel-execute:" name))))
      (defalias fname
        (lambda (body params)
          (:documentation
           (format "Execute a block of %s commands with Babel." name))
          (let ((shell-file-name name))
            (org-babel-execute:shell body params))))
      (put fname 'definition-name 'org-babel-shell-initialize))
    (defalias (intern (concat "org-babel-variable-assignments:" name))
      #'org-babel-variable-assignments:shell
      (format "Return list of %s statements assigning to the block's \
variables."
              name))
    (funcall (if (fboundp 'defvar-1) #'defvar-1 #'set) ;Emacs-29
             (intern (concat "org-babel-default-header-args:" name))
             nil)))

Several different shells are supported.

Each needs the following functions:

  1. An alias org-babel-execute:template for org-babel-execute:shell
  2. An alias org-babel-variable-assignments:name for org-babel-variable-assignments:shell
  3. org-babel-default-header-args:name

These allow people to use any of the org-babel-shell-names as the source block language. For example, the ob-core functionality that runs the source block (org-babel-execute-src-block) expects a "org-babel-execute:template" function.

The way it works is this. It iterates through the org-babel-shell-names list. For each element, it defines a symbol for org-babel-execute:template according to the current list item (name). It makes this symbol an alias for a function that calls org-babel-execute:shell such that shell-file-name is assigned to the current list item. For example, when processing "fish", a call to org-babel-execute:fish will call org-babel-execute:shell yet any call to an inferior shell process will use "fish" rather than the shell stored in the SHELL environment variable at start up. (It's assumed that the command, such as "fish", is accessible through PATH). A "put" call makes org-babel-execute:template accessible from the built in help. It next creates an alias for org-babel-variable-assignments:name corresponding to org-babel-variable-assignments:shell. Finally, it creates a symbol for org-babel-default-header-args:name, using a different function (either set or defvar-1) depending on the Emacs version.

[1/1] Questions

  • [X]

    Is there a better way to do this?

    This sets off my alarms because it feels too clever.

    • probably complex for someone coming from a non-lispy background, creates a closure to modify shell-file-name, meta-programming
    • creates (what feels like) unnecessary dependencies between org-babel-shell-names and org-babel-shell-initialize
    • hints at being an end-user function when it shouldn't be used that way
    • assumes that the things it's automating are all the same

    tl;dr Make org-babel-shell-initialize take a list parameter, rather than operating on org-babel-shell-names as a side effect.

    It has immediate problems when learning the codebase, as written about above for org-babel-shell-names. It's feels convoluted to me. Understanding org-babel-shell-names requires knowing how org-babel-shell-initialize works. org-babel-initialize forces org-babel-shell-names to be defined twice.

    It seems like this function is trying to provide end-users with an easy way to update ob-shell with a new shell during run-time. That's cool and all…but that should be the job of the ob-shell maintainer. It's not simple supporting different shells, even if they're similar. What benefit is there in giving end-users a half-baked way to hang themselves?

    Further, I doubt many end-users are using ob-shell with a custom shell not supported. There's one exception: cmd.exe on Windows. However, adding "cmd.exe" or "cmdproxy.exe" to org-babel-shell-names doesn't really work well. Nor is it necessary: you can just use "shell".

    I think it's the automation part that concerns me most. It's the fear of "What if there's an exception that doesn't fit into the automation". My non-fear brain says, "Has that been a problem since this was implemented? How hard would it be to change it when a problem arises that requires such a change?" Leaving the implementation as is costs nothing (but conceptual energy). Re-working it will likely be easier later, if and when, a problem appears since there will be a clear goal.

    Part of me thinks it would be better to define org-babel-shell-names once, populated with all the shells and then to manually calling org-babel-shell-initialize to define the execution functions, rather than forward declare org-babel-shell-names and using the defcustom call to trigger the initialization.

    Maybe make org-babel-shell-initialize take a parameter which is a list of shells to initialize? In this way, it would break the dependency on org-babel-shell-names. Then we could declare org-babel-shell-names once and have it call the initialize function.

    1. Make org-babel-shell-initialize take a parameter which a list of shells to initialize
    2. Define the list org-babel-shell-names
    3. Set org-babel-shell-names to call org-babel-shell-initialize when set.
      • Can defcustom refer to itself during the set call? Yes! It seems so.

[0/3] Refactor

  • [ ] Make org-babel-shell-initialize private. No one but the maintainer should be running that.
  • [ ] Document the interaction between org-babel-shell-initialize and org-babel-shell-names
  • [ ] Make org-babel-shell-initialize functional
    1. Make org-babel-shell-initialize take a parameter which is a list of shells to initialize
    2. Define the list org-babel-shell-names using defcustom
    3. Set org-babel-shell-names to call org-babel-shell-initialize when set.

      For example,

      (defun my-initialize (initialization-list)
        (interactive)
        (dolist (name initialization-list)
          (message "%s" (concat "my-automatically-created-symbol:" name))))
      
      (defcustom my-name-list
        '("banana" "Rama")
        "List of names to be created by `my-initialize'"
        :group 'my-test
        :type '(repeat (string :tag "Name to create: "))
        :set (lambda (symbol value)
               (set-default-toplevel-value symbol value)
               (my-initialize my-name-list)))
      

ob-shell undocumented behaviors

Problem: Undocumented functionality

Some of ob-shell.el is dedicated to undocumented behavior. The org-babel-load-in-session function which is bound to the org-metaup-hook, M-up. The source body will be inserted in a new session process buffer.

For this is the code path (M-up):

  • ob-core:org-babel-load-in-session
  • ob-shell:org-babel-prep-session:shell

Now documented, yet a good example of this, is the :dir header. There have been many questions on the mailing list regarding shell blocks and ssh. The :dir header works with TRAMP to create a remote connection. This works seamlessly with Babel. However, it was not clearly documented (the only example showed an R code block and is located in a separate part of the manual).

Addendum

History

In the early days of Babel, collaboration notes were kept. Eric makes a distinction between (what he calls) "functional" and "imperative". It's not clear to me what's meant by these terms. AFAICT, "functional" means "non-persistent" and "imperative" means "persistent". "persistent" means here that when you run a block twice, you always get the same result with a "functional" block.

See:

2024-01-06

Powered by peut-publier

©2024 Excalamus.com