Saturday, June 3, 2017

THE VAULT : The #! magic, details about the shebang/hash-bang mechanism on various Unix flavours

  • Fancy code

    • 2.8BSD implemented the test for the #! magic with a multi-character constant:
       #define SCRMAG '#!'
    • Demos (originally based on 2.9BSD) inherited SCRMAG and even added its own multi-character constant for a variant of the magic:
       # define SCRMAG2 '/*#!'
       # define ARGPLACE "$*"
      Find more information in the end notes [Demos].
    • BSD/OS (2.0, sys/i386/i386/exec_machdep.c) shows a readable way to construct the magic:
       [...]
       switch (magic) {
       /* interpreters (note byte order dependency) */
       case '#' | '!' << 8:
        handler = exec_interpreter;
        break;
       case [...]
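
The byte-order note in the BSD/OS excerpt can be illustrated with a small sketch. The names SCRIPT_MAGIC_LE, load_le16 and is_script are made up for this illustration and are not taken from any kernel source. The sketch only assumes that the magic is compared against the first two bytes of the file, combined the way a little-endian 16-bit load would combine them.

```c
#include <stddef.h>
#include <stdint.h>

/* Same expression as in the excerpt: '#' in the low byte, '!' in the high byte. */
#define SCRIPT_MAGIC_LE ('#' | '!' << 8)

/* Combine the first two bytes as a little-endian 16-bit load would,
 * independent of the byte order of the machine running this code. */
static uint16_t load_le16(const unsigned char *buf)
{
    return (uint16_t)(buf[0] | buf[1] << 8);
}

/* Return 1 if the buffer starts with "#!", 0 otherwise. */
static int is_script(const unsigned char *buf, size_t len)
{
    if (len < 2)
        return 0;
    return load_le16(buf) == SCRIPT_MAGIC_LE;
}
```

Building the value byte by byte avoids the dependency on the host byte order that a plain 16-bit read of the file header would have.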
  • POSIX.2 or SUSv2 / SUSv3 / SUSv4 mention #! only as a possible extension:
        Shell Introduction
        [...]
        If the first line of a file of shell commands starts with the
        characters #!, the results are unspecified.
    
        The construct #! is reserved for implementations wishing to provide
        that extension. A portable application cannot use #! as the first
        line of a shell script; it might not be interpreted as a comment.
        [...]
    
        Command Search and Execution
        [...]
        This description requires that the shell can execute shell
        scripts directly, even if the underlying system does not support
        the common #! interpreter convention. That is, if file foo contains
        shell commands and is executable, the following will execute foo:
    
          ./foo 
    There was a Working Group Resolution trying to define the mechanism.
    On the other hand, "#!/bin/sh" works on any Unix: it is a rock-solid and portable convention by tradition, as long as all you expect to be called is something from the Bourne shell family or its descendants.

  • What's special about #!
    #! was a great hack to make scripts look and feel like real executable binaries.
    But, as a little summary, what's special about #!? (list mostly courtesy of David Korn)

    • the interpreter name must not contain blanks
    • the maximum length of the #! line is much smaller than the maximum path length
    • $PATH is not searched for the interpreter
       (apart from an absolute path, the #! line also accepts a relative path,
       and #!interpreter is equivalent to #!./interpreter;
       however, this is not of any practical use)
    • the interpreter usually must not itself be a #! script
    • the handling of arguments in the #! line itself varies between systems
    • the setuid mechanism may or may not be available for the script
    • there's no way to express #!$SHELL
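
Some of the points above can be made concrete with a small parser for the #! line. This is only a sketch under stated assumptions: parse_shebang is a hypothetical helper, not code from any kernel, and it implements just one of the varying behaviours, namely passing everything after the interpreter name on as a single argument with surrounding blanks stripped. Other systems split the arguments or treat them differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split a "#!" line in place into the interpreter
 * path and at most ONE argument string (one possible behaviour only).
 * Returns 0 on success, -1 if this is no #! line or names no interpreter.
 * *arg is set to NULL if the line carries no argument. */
static int parse_shebang(char *line, char **interp, char **arg)
{
    char *p, *end;

    *interp = NULL;
    *arg = NULL;
    if (line[0] != '#' || line[1] != '!')
        return -1;

    line[strcspn(line, "\n")] = '\0';   /* only the first line counts */

    p = line + 2;
    p += strspn(p, " \t");              /* skip blanks after "#!" */
    if (*p == '\0')
        return -1;
    *interp = p;                        /* used as given, no $PATH search */

    p += strcspn(p, " \t");             /* the first blank ends the name */
    if (*p == '\0')
        return 0;
    *p++ = '\0';
    p += strspn(p, " \t");
    if (*p == '\0')
        return 0;

    end = p + strlen(p);                /* strip trailing blanks */
    while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
        end--;
    *end = '\0';
    *arg = p;                           /* the rest stays one argument */
    return 0;
}
```

With this behaviour a line such as "#!/usr/bin/env perl -w" yields the interpreter /usr/bin/env and the single argument "perl -w", which shows why arguments in the #! line are not portable.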
  • Possible errors:
    • If the interpreter is not found, the system returns ENOENT. This error can be misleading, because many shells then print the script name instead of the interpreter named in its #! line:
       $ cat script.sh
       #!/bin/notexistent
       $ ./script.sh
       ./script.sh: not found
      bash, since release 3, subsequently reads the first line itself and gives a diagnostic concerning the interpreter:
       bash: ./script.sh: /bin/notexistent: bad interpreter: No such file or directory
    • If the #! line is too long, at least three things can happen:
      • The line is truncated, usually to the maximum length allowed.
      • The system returns E2BIG (IRIX, SCO OpenServer) or ENAMETOOLONG (FreeBSD, BIG-IP4.2, BSD/OS4.1)
        and you get something like "Arg list too long" / "Arg list or environment too large" or "File name too long", respectively.
      • The kernel refuses to execute the file and returns ENOEXEC. In some shells this results in a silent failure.
        Other shells subsequently try to interpret the script themselves.
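
The error cases above can be told apart by looking at errno after a failed exec call. The following is a minimal sketch, not the way any particular shell does it: explain_exec_errno and try_exec are hypothetical names, and the messages are only this sketch's wording for the cases listed above.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: map errno from a failed exec to the #!-related
 * causes described in the list above. */
static const char *explain_exec_errno(int err)
{
    switch (err) {
    case ENOENT:
        return "script or the interpreter named in its #! line not found";
    case E2BIG:
        return "#! line too long (reported as E2BIG)";
    case ENAMETOOLONG:
        return "#! line too long (reported as ENAMETOOLONG)";
    case ENOEXEC:
        return "kernel refused to execute the file";
    default:
        return "error not covered by this sketch";
    }
}

/* Try to execute path without further arguments.
 * Only returns if the exec failed; the return value is errno. */
static int try_exec(const char *path)
{
    char *const argv[] = { (char *)path, NULL };

    execv(path, argv);
    return errno;
}
```

A caller could print explain_exec_errno(try_exec("./script.sh")) instead of the bare "not found", so that a missing interpreter is at least mentioned as a possible cause.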
