HTTP Parser
===========

This is a parser for HTTP messages written in C. It parses both requests and
responses. The parser is designed to be used in high-performance HTTP
applications. It makes no syscalls or allocations, does not buffer data, and
can be interrupted at any time. Depending on your architecture, it only
requires about 40 bytes of data per message stream (in a web server that is
per connection).

Features:

  * No dependencies
  * Handles persistent streams (keep-alive)
  * Decodes chunked encoding
  * Upgrade support
  * Defends against buffer overflow attacks

The parser extracts the following information from HTTP messages:

  * Header fields and values
  * Content-Length
  * Request method
  * Response status code
  * Transfer-Encoding
  * HTTP version
  * Request URL
  * Message body


Usage
-----

One `http_parser` object is used per TCP connection. Initialize the struct
using `http_parser_init()` and set the callbacks. That might look something
like this for a request parser:

    http_parser_settings settings;
    memset(&settings, 0, sizeof(settings)); /* unused callbacks must be NULL */
    settings.on_url = my_url_callback;
    settings.on_header_field = my_header_field_callback;
    /* ... */

    http_parser *parser = malloc(sizeof(http_parser));
    http_parser_init(parser, HTTP_REQUEST);
    parser->data = my_socket;

When data is received on the socket, execute the parser and check for errors.

    size_t len = 80*1024, nparsed;
    char buf[len];
    ssize_t recved;

    recved = recv(fd, buf, len, 0);

    if (recved < 0) {
      /* Handle error. */
    }

    /* Start up / continue the parser.
     * Note we pass recved==0 to signal that EOF has been received.
     */
    nparsed = http_parser_execute(parser, &settings, buf, recved);

    if (parser->upgrade) {
      /* handle new protocol */
    } else if (nparsed != recved) {
      /* Handle error. Usually just close the connection. */
    }

HTTP needs to know where the end of the stream is. For example, sometimes
servers send responses without Content-Length and expect the client to
consume input (for the body) until EOF. To tell http_parser about EOF, give
`0` as the fourth parameter to `http_parser_execute()`. Callbacks and errors
can still be encountered during an EOF, so one must still be prepared
to receive them.

Scalar-valued message information such as `status_code`, `method`, and the
HTTP version is stored in the parser structure. This data is only
temporarily stored in `http_parser` and gets reset on each new message. If
this information is needed later, copy it out of the structure during the
`headers_complete` callback.

The parser decodes the transfer-encoding for both requests and responses
transparently. That is, a chunked encoding is decoded before being sent to
the on_body callback.


The Special Problem of Upgrade
------------------------------

HTTP supports upgrading the connection to a different protocol. An
increasingly common example of this is the Web Socket protocol, which sends
a request like

    GET /demo HTTP/1.1
    Upgrade: WebSocket
    Connection: Upgrade
    Host: example.com
    Origin: http://example.com
    WebSocket-Protocol: sample

followed by non-HTTP data.

(See http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75 for more
information on the Web Socket protocol.)

To support this, the parser treats the request as a normal HTTP message
without a body, issuing both the on_headers_complete and on_message_complete
callbacks. However, http_parser_execute() will stop parsing at the end of the
headers and return.

The user is expected to check if `parser->upgrade` has been set to 1 after
`http_parser_execute()` returns. Non-HTTP data begins in the supplied buffer
at the offset given by the return value of `http_parser_execute()`.


Callbacks
---------

During the `http_parser_execute()` call, the callbacks set in
`http_parser_settings` will be executed. The parser maintains state and
never looks behind, so buffering the data is not necessary. If you need to
save certain data for later usage, you can do that from the callbacks.

There are two types of callbacks:

* notification `typedef int (*http_cb) (http_parser*);`
    Callbacks: on_message_begin, on_headers_complete, on_message_complete.
* data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
    Callbacks: (requests only) on_url,
               (common) on_header_field, on_header_value, on_body;

Callbacks must return 0 on success. Returning a non-zero value indicates
an error to the parser, making it exit immediately.

If you parse an HTTP message in chunks (e.g. `read()` the request line
from the socket, parse, read half of the headers, parse, etc.), your data
callbacks may be called more than once. http-parser guarantees that the data
pointer is valid only for the lifetime of the callback. You can also `read()`
into a heap-allocated buffer to avoid copying memory around if this fits your
application.

Reading headers may be a tricky task if you read/parse headers partially.
Basically, you need to remember whether the last header callback was a field
or a value and apply the following logic:

    (on_header_field and on_header_value shortened to on_h_*)
     ------------------------ ------------ --------------------------------------------
    | State (prev. callback) | Callback   | Description/action                         |
     ------------------------ ------------ --------------------------------------------
    | nothing (first call)   | on_h_field | Allocate new buffer and copy callback data |
    |                        |            | into it                                    |
     ------------------------ ------------ --------------------------------------------
    | value                  | on_h_field | New header started.                        |
    |                        |            | Copy current name,value buffers to headers |
    |                        |            | list and allocate new buffer for new name  |
     ------------------------ ------------ --------------------------------------------
    | field                  | on_h_field | Previous name continues. Reallocate name   |
    |                        |            | buffer and append callback data to it      |
     ------------------------ ------------ --------------------------------------------
    | field                  | on_h_value | Value for current header started. Allocate |
    |                        |            | new buffer and copy callback data to it    |
     ------------------------ ------------ --------------------------------------------
    | value                  | on_h_value | Value continues. Reallocate value buffer   |
    |                        |            | and append callback data to it             |
     ------------------------ ------------ --------------------------------------------


Parsing URLs
------------

A simplistic zero-copy URL parser is provided as `http_parser_parse_url()`.
Users of this library may wish to use it to parse URLs constructed from
consecutive `on_url` callbacks.

See examples of reading in headers:

* [partial example](http://gist.github.com/155877) in C
* [from http-parser tests](http://github.com/joyent/http-parser/blob/37a0ff8/test.c#L403) in C
* [from Node library](http://github.com/joyent/node/blob/842eaf4/src/http.js#L284) in Javascript
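
As a complement to the linked examples, the field/value logic from the table in the Callbacks section can be sketched roughly as follows. This is a minimal illustration, not code from the library: `struct header_acc`, `acc_on_header_field()`, `acc_on_header_value()`, `acc_finish()` and the `printf` "headers list" are all hypothetical names invented here, and the functions take the accumulator directly instead of an `http_parser*` so that the sketch compiles without `http_parser.h`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator; a real program might hang it off parser->data. */
enum last_cb { LAST_NONE, LAST_FIELD, LAST_VALUE };

struct header_acc {
    enum last_cb last;            /* which header callback ran previously */
    char *name;  size_t name_len;
    char *value; size_t value_len;
    int completed;                /* number of finished headers so far */
};

/* Grow *buf and append `length` bytes, keeping it NUL-terminated.
 * Returns 0 on success, -1 if allocation fails. */
static int append(char **buf, size_t *len, const char *at, size_t length) {
    char *p = realloc(*buf, *len + length + 1);
    if (p == NULL) return -1;
    memcpy(p + *len, at, length);
    *len += length;
    p[*len] = '\0';
    *buf = p;
    return 0;
}

/* Hand the finished name/value pair to the application (here: print it),
 * then reset the buffers for the next header. */
static void flush_header(struct header_acc *acc) {
    assert(acc->name != NULL);
    printf("%s: %s\n", acc->name, acc->value ? acc->value : "");
    free(acc->name);  acc->name = NULL;  acc->name_len = 0;
    free(acc->value); acc->value = NULL; acc->value_len = 0;
    acc->completed++;
}

/* Rows "nothing", "value" and "field" followed by on_h_field. */
static int acc_on_header_field(struct header_acc *acc,
                               const char *at, size_t length) {
    if (acc->last == LAST_VALUE) flush_header(acc); /* new header started */
    acc->last = LAST_FIELD;
    return append(&acc->name, &acc->name_len, at, length);
}

/* Rows "field" and "value" followed by on_h_value. */
static int acc_on_header_value(struct header_acc *acc,
                               const char *at, size_t length) {
    acc->last = LAST_VALUE;
    return append(&acc->value, &acc->value_len, at, length);
}

/* Intended to be called once the headers are complete: emits the last
 * pending header, if any. */
static void acc_finish(struct header_acc *acc) {
    if (acc->name != NULL) flush_header(acc);
    acc->last = LAST_NONE;
}
```

In a real program, thin wrappers with the `http_data_cb` signature would presumably fetch the accumulator from `parser->data` and forward `at` and `length` to these functions, with `acc_finish()` called from `on_headers_complete`. Because the data is copied on every call, the sketch stays within the guarantee that the data pointer is valid only for the lifetime of the callback.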