local CGI

I am often frustrated by the crappiness of UI APIs in the scripting languages I have available. But I have found writing web-apps in PHP to be mostly quite pleasant, despite the fact I hate HTML. Obviously I can't reasonably run all my scripts on Dreamhost's systems; some things only make sense here. But running CGI programs on my own system needs a web server. I'm fairly sure I tried webfsd for this in the past (which I still sometimes use for quickly transferring files across our LAN, though it has... issues...), but IIRC its CGI support didn't work for me. Meh. I think I successfully used CGI on Boa years ago, but I uninstalled that because it ran as a system-wide webserver, which I didn't really want.

Lots of people have made a "small, lightweight fast simple web server written entirely in bla bla bla", with little obvious benefit over the 5000 other ones. I may or may not join them. However if I do, what I basically want is an httpd that:

existing small webservers


There's tons of the buggers. But frankly many of them look comparable in size to Apache, which I'm presuming is the canonical "big" webserver. (Actually the "Apache common" package seems to be fairly big, but it's still not gargantuan.)
Actually, on further inspection it seems much of the reported "installed size" of Debian packages is probably from directories and subdirectories, even when those directories are shared with other packages (e.g. /usr/), because the packages still have to declare them. And of course there's still things like documentation (I should bloody hope). So many of these may be a lot smaller than that figure would suggest. Anyway, a list of ones I found, by no means exhaustive:

Note that some of my "requirements", when applied to pre-existing servers, are more like preferences/prejudices. The main thing is simply that I want something to run only on a per-user basis as needed, rather than permanently system-wide on port 80; I want it secure, and I want to be able to run my own bespoke web-apps on it. Smallness is a virtue but it needn't be the absolute smallest. A minimal featureset is a virtue for avoiding configuration complexity and worries about whether X, Y and Z work.

Other things

Increasingly random stuff TODO: move some of these project links elsewhere

TODO: Find out how SCGI works as that sounds good. And how inetd interfacing works, that confuses the crap out of me.
TODO: decide if that todo above should be using todo tags. ouch

more options found since

TODO: also add those ones I have in my bookmarks from ages ago.

SSI


Looks like it's pretty much a commodity standard language rather than something very server-specific. The Server Side Includes article at Wikipedia summarises it; it looks very simple, so support for it is probably manageable to write.
I think the main thing needed to support it would be a lexer that can tell which part of a document each character is in (content, tag, quoted string in a tag, character following a \ char, or character entity), in order to determine correctly what is an SSI tag. The rest is probably very easy.
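
As a rough idea of the "finding the SSI tags" part, here is a minimal Python sketch. It is a regex shortcut rather than the proper lexer described above: it assumes directives look like `<!--#name attr="value" -->` with double-quoted values, and it does not handle backslash escapes or directives hiding inside other markup. The names `find_directives`, `DIRECTIVE` and `ATTRIBUTE` are made up for illustration.

```python
import re

# Assumed directive shape: <!--#name attr="value" attr2="value2" -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+="[^"]*")*)\s*-->')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

def find_directives(text):
    """Return a (name, {attr: value}) pair for each directive found in text."""
    found = []
    for match in DIRECTIVE.finditer(text):
        attrs = dict(ATTRIBUTE.findall(match.group(2)))
        found.append((match.group(1), attrs))
    return found

page = '<p>Hi</p><!--#include virtual="/footer.html" --><!--#echo var="DATE_LOCAL" -->'
print(find_directives(page))
```

A real implementation would probably want the state-machine lexer instead, since a regex cannot tell whether a match sits inside some other context where it should be ignored.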
If possible, I would extend it to be able to specify $_GET and $_POST elements as parameters to pass to programs it calls (as an alternative to using the exec cgi directive; why would you put exec cgi in an SSI when you could just use a CGI to begin with?! Actually I can think of reasons yes...). This would make it pretty damn useful for quick+dirty coding. BUT those parameters would have to be escaped for the shell, which the interpreter really should do automatically.
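
The shell-escaping step could look something like this in Python, using the standard library's `shlex.quote`. This is only a sketch: the `build_command` helper, the `./search.sh` program and the `name=value` argument convention are all assumptions for the sake of the example, not anything SSI defines.

```python
import shlex

def build_command(program, params):
    """Join program and name=value arguments, quoting each for a POSIX shell."""
    parts = [shlex.quote(program)]
    for name, value in params.items():
        # Quote the whole argument so ; & | $ and friends stay literal.
        parts.append(shlex.quote(f"{name}={value}"))
    return " ".join(parts)

# A hostile GET parameter ends up as one harmless argument.
command = build_command("./search.sh", {"q": "potato; rm -rf ~"})
print(command)
```

If the interpreter never goes through a shell at all (passing an argument list straight to the exec call), the quoting problem mostly disappears, which may be the safer design.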

Meh. By the looks of things, the Wikipedia page is only good for a vague summary, and a better reference would be the one from Apache, which also lists the ability to set variables. And I think the features I'd want from a mini server-embedded language would probably be different to what SSI gives: mostly those GET and POST elements and simple database access, plus some of the same things SSI does give, but also a spec that I can be more sure of (SSI's seems a little vague in places, and there are questions I don't see answered). I think it shouldn't have to do anything too obscure; if I want that I'll use CGI.

Other CGI ideas

See the CGI/1.1 standard doc, though it is quite old. It doesn't seem especially hard although it looks like it shifts some of the burden of parsing GET and POST requests to the CGI script.
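
To show what that parsing burden looks like from the script's side, here is a small Python sketch. It relies on the CGI/1.1 variables `REQUEST_METHOD`, `QUERY_STRING` and `CONTENT_LENGTH`; the function name `read_cgi_params` is made up, and the environment and input stream are passed in as arguments (an assumption, so it can be tried without a server).

```python
import io
from urllib.parse import parse_qs

def read_cgi_params(environ, stdin):
    """Collect form fields the way a CGI script has to: by itself."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # The server only hands over the raw body; the length comes from the env.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)

# Stand-in for the environment a server would set up for a GET request.
fake_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "q=potato&page=2"}
print(read_cgi_params(fake_env, io.BytesIO(b"")))
```

In a real script the arguments would be `os.environ` and `sys.stdin.buffer`.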

Could make a simple CGI-programming language using Flex or Ragel, and CGILib. It could be like a tiny little alternative to PHP, with similar access to $_GET and $_POST etc. Built-in functions for accessing SQLite databases, Berkeley DB databases, some other kinds too maybe, and plain fixed-width record flatfile databases. Oh and Ming I guess, as SWF is so handy. Perhaps it could make use of GNU Lightning for doing meta-programming, particularly as this might make it practical to have a set of extensions written in itself.

As mentioned above, an SSI type implementation in the server itself might be a worthwhile supplement to full-on CGI programming, to let it do simple stuff quite fast (no forking). It'd presumably need a parser written (again) with Ragel or something.

RECENT BRAINWAVE: A very nice simple way of handling a new bookmarky thingy webapp would be to use Firefox's support for bookmarklets and keymarks. These can be little snippets of Javascript embedded in bookmarks, callable by doing things like entering "gi potato" to do a Google Images search for "potato", when "gi" is a keyword given to the bookmark. Except the "gi" example substitutes text into a normal URL, and a more bookmarkletty bookmarklet substitutes it into a piece of Javascript code. (Actually, here I got keymarks and bookmarklets a bit mixed up, a situation exacerbated by the fact the stupid bastards don't really say much about them, let alone use the term "bookmarklet" in their knowledge base.)

Maybe a bookmarklet or similar could be made to send the current URL, encoded, to a script on the locally-running CGI app server thingy, in order to do external bookmarks. That should probably be quite easy Javascript, although how to call it is an open question (typing into the address bar sounds cumbersome, probably no easier than copy-pasting it elsewhere by hand). (The idea has since been explored in bookmarklets and keymarks.)
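
The "encoded" part is just percent-encoding the page URL so it survives being stuffed into another URL's query string (a bookmarklet would use Javascript's `encodeURIComponent` for this). Here is a Python sketch of both ends. The script address `http://localhost:8000/bookmark.cgi`, the `url` parameter name and both function names are hypothetical.

```python
from urllib.parse import parse_qs, quote, urlsplit

def bookmark_request_url(page_url, base="http://localhost:8000/bookmark.cgi"):
    """Build the request the bookmarklet would make (base is a made-up address)."""
    # safe="" encodes / ? & = too, so the page URL stays one opaque value.
    return base + "?url=" + quote(page_url, safe="")

def bookmarked_url(request_url):
    """What the receiving script would recover from the query string."""
    return parse_qs(urlsplit(request_url).query)["url"][0]

request = bookmark_request_url("http://example.com/search?q=potato&page=2")
print(request)
print(bookmarked_url(request))
```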

Web-server side of it


A web server primarily for CGI still has to implement a reasonable amount of HTTPish stuff, even if it's not going to do much of what a traditional server would.
HTTP/1.0 is defined in RFC 1945
HTTP/1.1 is defined in RFC 2616

I should probably aim somewhere in the middle, and maybe send error messages for features that aren't supported?? I'm not sure. I should read the blasted specs first.

Also important to see are RFC 2119, which defines the requirement-level keywords (MUST, SHOULD, MAY and so on) used in RFCs in general, and RFC 2145, "Use and Interpretation of HTTP Version Numbers".
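
One practical consequence of the version-number rules is that the major and minor parts are separate integers, not a decimal fraction. A Python sketch of pulling them out of a request line follows; the `parse_request_line` name and the strict single-space regex are my assumptions, and a tolerant server might want to accept sloppier input.

```python
import re

# Method, request target, then HTTP/<major>.<minor>
REQUEST_LINE = re.compile(r"^(\S+) (\S+) HTTP/(\d+)\.(\d+)$")

def parse_request_line(line):
    """Split a request line into (method, target, (major, minor))."""
    match = REQUEST_LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("malformed request line")
    method, target, major, minor = match.groups()
    # Keep the version as a tuple of ints so comparisons work numerically.
    return method, target, (int(major), int(minor))

print(parse_request_line("POST /pants.cgi HTTP/1.0\r\n"))
```

Comparing the tuples then does the right thing: (1, 10) sorts above (1, 9), which a float comparison of 1.10 and 1.9 would get wrong.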

HTTPS


HTTPS is not something I can use on Dreamhost, because it's not really compatible with virtual hosting and they need you to pay for a dedicated IP before you can get it. However it might be usable on a local system like the one I'm describing. There are various standards it can be built upon, such as SSLv2, SSLv3, and TLS (which has various versions too). SSLv2 is apparently quite obsolete, so TLS is probably the appropriate way to go. Or that "stunnel" program maybe??? I'm not sure whether the latter would get in the way.
TLS/1.0 is covered by RFC 2246
I don't have the link for 1.1 etc etc, 1.0 is old.

Also see CAcert who do free certificates, would be useful with this (but still not with Dreamhost).

POST data


See RFC 1867, "Form-based File Upload in HTML", which as far as I can see, explains how the POST method is actually done a lot more clearly than the HTTP specs themselves :P

Apart from that it's sorta handy to see how uploads are implemented too, although I doubt I'd add that feature to the server really. As it's meant to run locally, and only really has the network socket to communicate with a web browser. Files can be sent either as URLs or names in the filesystem. However, it might wind up used on a LAN somewhere, and the protocol doesn't look so hard to receive. Question of how to deal with the files sent remains meh, and do I resend the rest of the items to CGIs as normal POSTdata?

Anyway: From both that RFC, and the PHP docs on file uploads, uploads use an encoding type of "multipart/form-data". However, according to both that RFC and my experiments with Dillo talking to a socket, normal POST data from forms uses "application/x-www-form-urlencoded". Example, I got:

POST /pants.cgi HTTP/1.0
Host: localhost
User-Agent: Dillo/0.8.0-pre
Cookie2: $Version="1"
Content-type: application/x-www-form-urlencoded
Content-length: 19

field=put+text+here

NOTE that there was NOT a trailing newline after "here"; it just ended. That is part of the spec: there should only be as many characters as stated in Content-length, with no extra newlines or crap added.
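
On the server side, that means reading the headers line by line and then reading exactly Content-length bytes for the body, rather than waiting for a newline that never comes. A minimal Python sketch, assuming a binary file-like stream (such as one made from a socket) and a made-up function name `read_post_fields`:

```python
import io
from urllib.parse import parse_qs

def read_post_fields(stream):
    """Read one request from a binary stream and return its form fields."""
    stream.readline()  # request line, e.g. POST /pants.cgi HTTP/1.0 (ignored here)
    headers = {}
    while True:
        line = stream.readline().decode("latin-1").strip()
        if not line:  # the blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    length = int(headers.get("content-length", "0"))
    body = stream.read(length)  # exactly this many bytes, no more
    return parse_qs(body.decode("ascii"))

raw = (
    b"POST /pants.cgi HTTP/1.0\r\n"
    b"Host: localhost\r\n"
    b"Content-type: application/x-www-form-urlencoded\r\n"
    b"Content-length: 19\r\n"
    b"\r\n"
    b"field=put+text+here"
)
print(read_post_fields(io.BytesIO(raw)))
```

The sketch skips checking the Content-type and does no error handling for a missing or bogus length, both of which a real server would need.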

See also, networking stuff, lcgid implementation


