llhttp State Machine Diagram

This document visualizes the state machine encoded in the llhttp HTTP parser.

High-Level Overview

stateDiagram-v2
    [*] --> start

    start --> load_type : non_whitespace
    start --> start : CR_LF_skip

    load_type --> start_req : type_REQUEST
    load_type --> start_res : type_RESPONSE
    load_type --> start_req_or_res : type_BOTH

    state "Request Parsing" as req_group {
        start_req --> method_parsing : first_char
        method_parsing --> req_first_space_before_url : method_complete
        req_first_space_before_url --> req_spaces_before_url : SP
        req_spaces_before_url --> url_entry : non_SP
    }

    state "Response Parsing" as res_group {
        start_res --> res_http_major : HTTP_slash
        res_http_major --> res_http_dot : digit
        res_http_dot --> res_http_minor : dot
        res_http_minor --> res_http_end : digit
        res_http_end --> res_status_code : SP
        res_status_code --> res_status_start : 3_digits
        res_status_start --> res_status : SP
        res_status --> res_line_almost_done : status_text
        res_line_almost_done --> header_field_start : CR_LF
    }

    state "URL Parsing" as url_group {
        url_entry --> url_entry_normal : normal_request
        url_entry --> url_entry_connect : CONNECT_method
        url_entry_normal --> url_start
        url_start --> url_schema : alpha
        url_schema --> url_schema_delim : colon
        url_schema_delim --> url_server : double_slash
        url_server --> url_path : slash_or_SP
        url_path --> url_query_or_fragment : question_mark
        url_query_or_fragment --> url_query : query_chars
        url_query --> url_fragment : hash
        url_fragment --> url_to_http : SP
    }

    url_to_http --> req_http_start : SP
    req_http_start --> req_http_major : HTTP_slash
    req_http_major --> req_http_dot : digit
    req_http_dot --> req_http_minor : dot
    req_http_minor --> req_http_complete : digit
    req_http_complete --> header_field_start : CR_LF

    state "Header Parsing" as header_group {
        header_field_start --> span_start_header_field : token_char
        header_field_start --> headers_almost_done : CR
        span_start_header_field --> header_field : start_span
        header_field --> header_field_colon : colon
        header_field_colon --> header_value_discard_ws : colon_found
        header_value_discard_ws --> span_start_header_value : non_WS
        span_start_header_value --> header_value : start_span
        header_value --> header_value_almost_done : CR
        header_value_almost_done --> header_value_lws : LF
        header_value_lws --> header_field_start : non_WS_new_header
        header_value_lws --> header_value : WS_continuation
    }

    headers_almost_done --> after_headers_complete : LF

    state "Body Handling" as body_group {
        after_headers_complete --> span_start_body : has_body
        after_headers_complete --> message_complete : no_body

        span_start_body --> consume_content_length : Content_Length
        span_start_body --> chunk_size : chunked
        span_start_body --> eof : until_close

        consume_content_length --> message_complete : length_zero

        chunk_size --> chunk_size_digit : hex_digit
        chunk_size_digit --> chunk_size_almost_done : CR
        chunk_size_almost_done --> chunk_data : LF_size_gt_0
        chunk_size_almost_done --> message_complete : LF_size_eq_0
        chunk_data --> chunk_data_almost_done : chunk_consumed
        chunk_data_almost_done --> chunk_size : CR_LF
    }

    message_complete --> on_message_complete : callback
    on_message_complete --> is_equal_upgrade : check_upgrade
    is_equal_upgrade --> after_message_complete : no_upgrade
    is_equal_upgrade --> pause_upgrade : upgrade

    after_message_complete --> start : keep_alive
    after_message_complete --> closed : close

    closed --> [*]
    pause_upgrade --> [*]
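
The response branch above can be read as a small transition table. The following Python sketch walks one status line through the state names used in the diagram. It is a simplified model for illustration, not llhttp's generated code: the function name `parse_status_line`, the single-digit version fields, and the error messages are assumptions.

```python
def parse_status_line(line: bytes):
    """Walk one response status line through the states named in the diagram.

    Returns (major, minor, status_code, reason, visited_states).
    """
    if not line.startswith(b"HTTP/"):
        raise ValueError("start_res: expected 'HTTP/'")
    pos = len(b"HTTP/")
    state = "res_http_major"
    visited = ["start_res", state]
    major = minor = code = digits = 0
    reason = bytearray()

    def fail():
        raise ValueError(f"{state}: unexpected input at offset {pos}")

    while state != "header_field_start":
        ch = line[pos:pos + 1]
        if not ch:
            fail()  # input ended before the status line was complete
        nxt = state
        if state in ("res_http_major", "res_http_minor"):
            if not ch.isdigit():
                fail()
            if state == "res_http_major":
                major, nxt = int(ch), "res_http_dot"
            else:
                minor, nxt = int(ch), "res_http_end"
            pos += 1
        elif state == "res_http_dot":
            if ch != b".":
                fail()
            nxt, pos = "res_http_minor", pos + 1
        elif state in ("res_http_end", "res_status_start"):
            if ch != b" ":
                fail()
            nxt = "res_status_code" if state == "res_http_end" else "res_status"
            pos += 1
        elif state == "res_status_code":
            if not ch.isdigit():
                fail()
            code, digits, pos = code * 10 + int(ch), digits + 1, pos + 1
            if digits == 3:
                nxt = "res_status_start"
        elif state == "res_status":
            if ch == b"\r":
                nxt = "res_line_almost_done"  # CR is consumed by the next state
            else:
                reason += ch
                pos += 1
        else:  # res_line_almost_done
            if line[pos:pos + 2] != b"\r\n":
                fail()
            nxt, pos = "header_field_start", pos + 2
        if nxt != state:
            visited.append(nxt)
        state = nxt
    return major, minor, code, bytes(reason), visited
```

Calling `parse_status_line(b"HTTP/1.1 200 OK\r\n")` visits each response state once, in the order drawn in the diagram, and stops at `header_field_start`.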

Detailed Method Parsing State Machine

stateDiagram-v2
    [*] --> start_req

    start_req --> A_branch : A
    start_req --> B_branch : B
    start_req --> C_branch : C
    start_req --> D_branch : D
    start_req --> F_branch : F
    start_req --> G_branch : G
    start_req --> H_branch : H
    start_req --> L_branch : L
    start_req --> M_branch : M
    start_req --> N_branch : N
    start_req --> O_branch : O
    start_req --> P_branch : P
    start_req --> R_branch : R
    start_req --> S_branch : S
    start_req --> T_branch : T
    start_req --> U_branch : U

    A_branch --> ACL : CL
    A_branch --> ANNOUNCE : NNOUNCE

    B_branch --> BIND : IND

    C_branch --> CHECKOUT : HECKOUT
    C_branch --> CONNECT : ONNECT
    C_branch --> COPY : OPY

    D_branch --> DELETE : ELETE
    D_branch --> DESCRIBE : ESCRIBE

    F_branch --> FLUSH : LUSH

    G_branch --> GET : ET
    G_branch --> GET_PARAMETER : ET_PARAMETER

    H_branch --> HEAD : EAD

    L_branch --> LOCK : OCK
    L_branch --> LINK : INK

    M_branch --> MKCOL : KCOL
    M_branch --> MKACTIVITY : KACTIVITY
    M_branch --> MKCALENDAR : KCALENDAR
    M_branch --> MOVE : OVE
    M_branch --> MERGE : ERGE
    M_branch --> MSEARCH : dash_SEARCH

    N_branch --> NOTIFY : OTIFY

    O_branch --> OPTIONS : PTIONS

    P_branch --> POST : OST
    P_branch --> PUT : UT
    P_branch --> PATCH : ATCH
    P_branch --> PROPFIND : ROPFIND
    P_branch --> PROPPATCH : ROPPATCH
    P_branch --> PURGE : URGE
    P_branch --> PLAY : LAY
    P_branch --> PAUSE_METHOD : AUSE
    P_branch --> PRI : RI

    R_branch --> REPORT : EPORT
    R_branch --> REBIND : EBIND
    R_branch --> RECORD : ECORD
    R_branch --> REDIRECT : EDIRECT

    S_branch --> SEARCH : EARCH
    S_branch --> SUBSCRIBE : UBSCRIBE
    S_branch --> SOURCE : OURCE
    S_branch --> SETUP : ETUP
    S_branch --> SET_PARAMETER : ET_PARAMETER

    T_branch --> TRACE : RACE
    T_branch --> TEARDOWN : EARDOWN

    U_branch --> UNLOCK : NLOCK
    U_branch --> UNBIND : NBIND
    U_branch --> UNSUBSCRIBE : NSUBSCRIBE
    U_branch --> UNLINK : NLINK

    ACL --> req_first_space_before_url
    ANNOUNCE --> req_first_space_before_url
    BIND --> req_first_space_before_url
    CHECKOUT --> req_first_space_before_url
    CONNECT --> req_first_space_before_url
    COPY --> req_first_space_before_url
    DELETE --> req_first_space_before_url
    DESCRIBE --> req_first_space_before_url
    FLUSH --> req_first_space_before_url
    GET --> req_first_space_before_url
    GET_PARAMETER --> req_first_space_before_url
    HEAD --> req_first_space_before_url
    LOCK --> req_first_space_before_url
    LINK --> req_first_space_before_url
    MKCOL --> req_first_space_before_url
    MKACTIVITY --> req_first_space_before_url
    MKCALENDAR --> req_first_space_before_url
    MOVE --> req_first_space_before_url
    MERGE --> req_first_space_before_url
    MSEARCH --> req_first_space_before_url
    NOTIFY --> req_first_space_before_url
    OPTIONS --> req_first_space_before_url
    POST --> req_first_space_before_url
    PUT --> req_first_space_before_url
    PATCH --> req_first_space_before_url
    PROPFIND --> req_first_space_before_url
    PROPPATCH --> req_first_space_before_url
    PURGE --> req_first_space_before_url
    PLAY --> req_first_space_before_url
    PAUSE_METHOD --> req_first_space_before_url
    PRI --> req_first_space_before_url
    REPORT --> req_first_space_before_url
    REBIND --> req_first_space_before_url
    RECORD --> req_first_space_before_url
    REDIRECT --> req_first_space_before_url
    SEARCH --> req_first_space_before_url
    SUBSCRIBE --> req_first_space_before_url
    SOURCE --> req_first_space_before_url
    SETUP --> req_first_space_before_url
    SET_PARAMETER --> req_first_space_before_url
    TRACE --> req_first_space_before_url
    TEARDOWN --> req_first_space_before_url
    UNLOCK --> req_first_space_before_url
    UNBIND --> req_first_space_before_url
    UNSUBSCRIBE --> req_first_space_before_url
    UNLINK --> req_first_space_before_url
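
The first-letter branching above resembles a prefix tree (trie). The sketch below builds one from a subset of the methods in the diagram and walks it one character at a time until the space that ends the method. It is an illustrative model under stated assumptions, not llhttp's generated matcher: `build_trie` and `match_method` are hypothetical names, and the `MSEARCH` node is spelled `M-SEARCH` as it appears on the wire.

```python
# Subset of the methods shown in the diagram, in their on-the-wire spelling.
METHODS = ["GET", "GET_PARAMETER", "HEAD", "POST", "PUT", "PATCH",
           "PURGE", "M-SEARCH", "MKCOL", "MOVE"]


def build_trie(words):
    """Build a nested-dict trie; the key None marks a complete method."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = word
    return root


def match_method(trie, data: str):
    """Consume characters up to the first space.

    Returns (method, offset_of_space), or (None, len(data)) when the input
    ends inside a valid prefix and more data is needed.
    """
    node = trie
    for i, ch in enumerate(data):
        if ch == " ":
            if None in node:
                return node[None], i
            raise ValueError("space reached before a complete method")
        if ch not in node:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
        node = node[ch]
    return None, len(data)


TRIE = build_trie(METHODS)
```

Returning `None` for a partial match models a streaming parser that keeps its position in the trie and continues when the next buffer arrives.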

Header Value Special Processing

stateDiagram-v2
    [*] --> header_field_complete

    header_field_complete --> check_header_type

    check_header_type --> header_value_connection : Connection
    check_header_type --> header_value_content_length : Content_Length
    check_header_type --> header_value_te : Transfer_Encoding
    check_header_type --> header_value_upgrade : Upgrade
    check_header_type --> header_value : other_header

    state "Connection Header" as conn {
        header_value_connection --> connection_close : close
        header_value_connection --> connection_keep_alive : keep_alive
        header_value_connection --> connection_upgrade : upgrade
        header_value_connection --> connection_token : other_token

        connection_close --> set_F_CONNECTION_CLOSE
        connection_keep_alive --> set_F_CONNECTION_KEEP_ALIVE
        connection_upgrade --> set_F_CONNECTION_UPGRADE
        connection_token --> header_value_connection_ws
        header_value_connection_ws --> header_value_connection : comma
    }

    state "Content-Length Header" as cl {
        header_value_content_length --> content_length_digit : digit
        content_length_digit --> content_length_digit : digit
        content_length_digit --> content_length_ws : SP_or_CR
        content_length_ws --> set_F_CONTENT_LENGTH
    }

    state "Transfer-Encoding Header" as te {
        header_value_te --> te_chunked : chunked
        header_value_te --> te_token : other_token
        te_chunked --> te_chunked_last : end_of_value
        te_chunked_last --> set_F_CHUNKED
        te_token --> te_token_ows : token_complete
        te_token_ows --> header_value_te : comma
    }

    set_F_CONNECTION_CLOSE --> header_value_almost_done
    set_F_CONNECTION_KEEP_ALIVE --> header_value_almost_done
    set_F_CONNECTION_UPGRADE --> header_value_almost_done
    set_F_CONTENT_LENGTH --> header_value_almost_done
    set_F_CHUNKED --> header_value_almost_done

    header_value --> header_value_almost_done : CR
    header_value_almost_done --> [*]
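
The effect of the special-cased headers can be summarized as "inspect the value, set a flag". The sketch below models that outcome with whole-string operations rather than the byte-at-a-time matching drawn above. The flag names follow the diagram; the numeric values produced by `auto()`, the `Flags` class, and the `header_flags` function are assumptions for illustration only.

```python
from enum import Flag, auto


class Flags(Flag):
    # Names mirror the set_F_* nodes; the values are illustrative.
    CONNECTION_KEEP_ALIVE = auto()
    CONNECTION_CLOSE = auto()
    CONNECTION_UPGRADE = auto()
    CONTENT_LENGTH = auto()
    CHUNKED = auto()


def header_flags(name: str, value: str) -> Flags:
    """Return the flags one header would set in this simplified model."""
    flags = Flags(0)
    lname = name.lower()
    tokens = [t.strip().lower() for t in value.split(",")]
    if lname == "connection":
        for token in tokens:
            if token == "close":
                flags |= Flags.CONNECTION_CLOSE
            elif token == "keep-alive":
                flags |= Flags.CONNECTION_KEEP_ALIVE
            elif token == "upgrade":
                flags |= Flags.CONNECTION_UPGRADE
    elif lname == "content-length":
        digits = value.strip()
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError("invalid Content-Length")
        flags |= Flags.CONTENT_LENGTH
    elif lname == "transfer-encoding":
        # As in the diagram, only a trailing "chunked" token sets the flag.
        if tokens[-1] == "chunked":
            flags |= Flags.CHUNKED
    return flags
```

Headers outside these three names fall through to the generic `header_value` path and set no flags.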

Chunked Transfer Encoding State Machine

stateDiagram-v2
    [*] --> chunk_size

    chunk_size --> chunk_size_digit : hex_digit
    chunk_size_digit --> chunk_size_digit : more_hex_digits
    chunk_size_digit --> chunk_size_otherwise : non_hex

    chunk_size_otherwise --> chunk_parameters : semicolon
    chunk_size_otherwise --> chunk_size_almost_done : CR

    chunk_parameters --> chunk_parameters : param_chars
    chunk_parameters --> chunk_size_almost_done : CR

    chunk_size_almost_done --> check_chunk_size : LF

    check_chunk_size --> span_start_body : size_gt_0
    check_chunk_size --> headers_almost_done_trailers : size_eq_0

    span_start_body --> consume_chunk_data
    consume_chunk_data --> consume_chunk_data : reading_body
    consume_chunk_data --> chunk_data_almost_done : all_bytes_read

    chunk_data_almost_done --> invoke_chunk_complete : CR_LF
    invoke_chunk_complete --> chunk_size : next_chunk

    headers_almost_done_trailers --> trailer_headers : has_trailers
    headers_almost_done_trailers --> message_complete : no_trailers

    trailer_headers --> message_complete : trailers_done

    message_complete --> [*]
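
The loop drawn above (size line, optional parameters, data, CRLF, repeat until a zero-size chunk, then trailers) can be sketched as a whole-buffer decoder. This is a minimal model that assumes the complete body is already in memory, unlike a streaming parser; the function name `decode_chunked` and its return shape are hypothetical.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def decode_chunked(data: bytes):
    """Decode a complete chunked body.

    Returns (body, trailer_lines, bytes_consumed).
    Raises ValueError on malformed or truncated input.
    """
    pos = 0
    body = bytearray()
    while True:
        eol = data.index(b"\r\n", pos)              # chunk_size_almost_done
        size_field = data[pos:eol].split(b";", 1)[0]  # drop chunk_parameters
        if not size_field or any(c not in HEX_DIGITS for c in size_field):
            raise ValueError("invalid chunk size")
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:                               # size_eq_0: go to trailers
            break
        chunk = data[pos:pos + size]                # consume_chunk_data
        if len(chunk) != size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated chunk or missing CRLF")
        body += chunk
        pos += size + 2                             # chunk_data_almost_done

    trailers = []
    while True:                                     # trailer_headers
        eol = data.index(b"\r\n", pos)
        line = data[pos:eol]
        pos = eol + 2
        if not line:                                # empty line ends the message
            break
        trailers.append(line)
    return bytes(body), trailers, pos
```

For example, the input `b"4\r\nWiki\r\n5;ext=1\r\npedia\r\n0\r\nX-Trailer: v\r\n\r\n"` yields the body `b"Wikipedia"` and one trailer line.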

Error States

stateDiagram-v2
    [*] --> parsing

    parsing --> error : invalid_input

    state error {
        HPE_INTERNAL
        HPE_STRICT
        HPE_LF_EXPECTED
        HPE_UNEXPECTED_CONTENT_LENGTH
        HPE_CLOSED_CONNECTION
        HPE_INVALID_METHOD
        HPE_INVALID_URL
        HPE_INVALID_CONSTANT
        HPE_INVALID_VERSION
        HPE_INVALID_HEADER_TOKEN
        HPE_INVALID_CONTENT_LENGTH
        HPE_INVALID_CHUNK_SIZE
        HPE_INVALID_STATUS
        HPE_INVALID_EOF_STATE
        HPE_INVALID_TRANSFER_ENCODING
    }

    parsing --> paused : callback_HPE_PAUSED
    paused --> parsing : llhttp_resume

    parsing --> paused_upgrade : upgrade_detected
    paused_upgrade --> parsing : llhttp_resume_after_upgrade
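
The control transitions above form a small lookup table. The toy model below encodes exactly the edges drawn in the diagram and rejects any other event; it is not llhttp's API. The class `ParserControl` and its `fire` method are hypothetical, and the event strings simply reuse the diagram's edge labels.

```python
class ParserControl:
    """Toy model of the control states in the diagram."""

    TRANSITIONS = {
        ("parsing", "invalid_input"): "error",
        ("parsing", "callback_HPE_PAUSED"): "paused",
        ("paused", "llhttp_resume"): "parsing",
        ("parsing", "upgrade_detected"): "paused_upgrade",
        ("paused_upgrade", "llhttp_resume_after_upgrade"): "parsing",
    }

    def __init__(self):
        self.state = "parsing"

    def fire(self, event: str) -> str:
        """Apply one event and return the new state."""
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            raise ValueError(f"no edge for {event!r} from state {self.state!r}")
        self.state = self.TRANSITIONS[key]
        return self.state
```

In this model `error` has no outgoing edges, matching the diagram, so every event fired after an error is rejected.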

Request/Response Detection (HTTP_BOTH mode)

stateDiagram-v2
    [*] --> start_req_or_res

    start_req_or_res --> req_or_res_method : peek_H
    start_req_or_res --> start_req : other_char

    req_or_res_method --> req_or_res_method_1 : H
    req_or_res_method_1 --> req_or_res_method_2 : E
    req_or_res_method_1 --> req_or_res_method_3 : T

    req_or_res_method_2 --> HEAD_method : AD
    HEAD_method --> start_req : is_request

    req_or_res_method_3 --> start_res : TP_slash

    note right of start_req_or_res : Disambiguates between HTTP response and HEAD request
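
The disambiguation above reduces to a prefix check on the first few bytes. The sketch below is a simplified model of that decision, assuming only the two `H` prefixes shown in the diagram; `detect_type` and its string return values are hypothetical names.

```python
def detect_type(data: bytes) -> str:
    """Classify the start of a message as 'request', 'response', or 'need_more'."""
    if not data:
        return "need_more"
    if data[:1] != b"H":
        return "request"          # other_char: hand over to start_req
    for prefix, kind in ((b"HEAD", "request"), (b"HTTP/", "response")):
        if data.startswith(prefix):
            return kind
        if prefix.startswith(data):
            return "need_more"    # valid so far, but too short to decide
    raise ValueError("starts with 'H' but is neither HEAD nor HTTP/")
```

Anything that does not start with `H` is treated as a request and left for method parsing to validate.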

State Categories

| Category | States | Purpose |
|----------|--------|---------|
| Initialization | `start`, `load_type` | Entry point, determine parsing mode |
| Method Parsing | `start_req`, `start_req_*` | Parse HTTP method (GET, POST, etc.) |
| URL Parsing | `url_*`, `span_start_stub_*` | Parse request URL components |
| Version Parsing | `req_http_*`, `res_http_*` | Parse HTTP/x.y version |
| Status Parsing | `res_status_*` | Parse response status code and text |
| Header Parsing | `header_field_*`, `header_value_*` | Parse header names and values |
| Body Handling | `consume_content_length`, `chunk_*`, `eof` | Handle message body |
| Completion | `message_complete`, `after_message_complete` | Finalize message parsing |
| Control | `closed`, `pause_*`, `error` | Connection state management |

Callbacks Triggered

| Callback | Trigger Point |
|----------|---------------|
| `on_message_begin` | Start of new message |
| `on_url` | URL data span |
| `on_url_complete` | URL parsing complete |
| `on_status` | Status text span (responses) |
| `on_status_complete` | Status parsing complete |
| `on_header_field` | Header name span |
| `on_header_field_complete` | Header name complete |
| `on_header_value` | Header value span |
| `on_header_value_complete` | Header value complete |
| `on_headers_complete` | All headers parsed |
| `on_body` | Body data span |
| `on_chunk_header` | Chunk size parsed |
| `on_chunk_complete` | Chunk data complete |
| `on_message_complete` | Entire message parsed |
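
Span callbacks deliver a slice of the input buffer, so a value split across two reads may arrive in more than one call, while the matching `*_complete` callback marks its end. The sketch below shows one way a consumer might accumulate header spans. It is a Python stand-in, not llhttp's C interface: the class `HeaderCollector` is hypothetical and only its method names mirror the callbacks listed above.

```python
class HeaderCollector:
    """Accumulate header name and value spans into complete pairs."""

    def __init__(self):
        self.headers = []
        self._field = bytearray()
        self._value = bytearray()

    def on_header_field(self, data: bytes) -> None:
        # May be called several times for one header name.
        self._field += data

    def on_header_value(self, data: bytes) -> None:
        # May be called several times for one header value.
        self._value += data

    def on_header_value_complete(self) -> None:
        # The pair is final only here; store it and reset the buffers.
        self.headers.append((bytes(self._field), bytes(self._value)))
        self._field.clear()
        self._value.clear()
```

Buffering until the `*_complete` callback keeps the consumer independent of how the input happened to be split.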