Commit e6ca9ed

Copilot and achamayou authored
Add ETag and If-None-Match support to GET /node/ledger-chunk/{chunk_name} (#7653)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: Amaury Chamayou <amchamay@microsoft.com>
Co-authored-by: Amaury Chamayou <amaury@xargs.fr>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 2eda636 commit e6ca9ed

8 files changed: +471 -95 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
1313

1414
- `GET` and `HEAD` `/node/ledger-chunk?since={seqno}` and `/node/ledger-chunk/{chunk_name}` endpoints, gated by the `LedgerChunkDownload` RPC interface operator feature. See [documentation](https://microsoft.github.io/CCF/main/operations/ledger_snapshot.html#download-endpoints) for more detail.
1515
- `GET` and `HEAD` `/node/ledger-chunk/{chunk_name}` and `/node/snapshot/{snapshot_name}` now support the `Want-Repr-Digest` request header and return the `Repr-Digest` response header accordingly (RFC 9530). Supported algorithms are `sha-256`, `sha-384`, and `sha-512`. If no supported algorithm is requested, the server defaults to `sha-256` (#7650).
16+
- `ETag` and `If-None-Match` support on `GET /node/ledger-chunk/{chunk_name}`, using SHA-256 by default for the `ETag` response header. Clients can supply `If-None-Match` with `sha-256`, `sha-384`, or `sha-512` digest ETags to avoid re-downloading unchanged content (#7652).
1617

1718
### Changed
1819

doc/operations/ledger_snapshot.rst

Lines changed: 53 additions & 17 deletions
@@ -50,26 +50,12 @@ Download Endpoints
5050
In order to facilitate long-term backup of the ledger files (also called chunks), nodes can enable HTTP endpoints that allow a client to download committed ledger files.
5151
The `LedgerChunkDownload` feature must be added to `enabled_operator_features` on the relevant `rpc_interfaces` entries in the node configuration.
5252

53-
1. :http:GET:`/node/ledger-chunk/{chunk_name}` and :http:HEAD:`/node/ledger-chunk/{chunk_name}`
54-
55-
These endpoints allow downloading a specific ledger chunk by name, where `<chunk-name>` is of the form `ledger_<start_seqno>-<end_seqno>.committed`.
56-
They support the HTTP `Range` header for partial downloads, and the `HEAD` method for clients to query metadata such as the total size without downloading the full chunk.
57-
They also populate the `x-ms-ccf-ledger-chunk-name` response header with the name of the chunk being served.
58-
59-
These endpoints also support the ``Want-Repr-Digest`` request header (`RFC 9530 <https://www.rfc-editor.org/rfc/rfc9530>`_).
60-
When set, the response will include a ``Repr-Digest`` header containing the digest of the full representation of the file.
61-
Supported algorithms are ``sha-256``, ``sha-384``, and ``sha-512``. If the header contains only unsupported or invalid algorithms, the server defaults to ``sha-256`` (as permitted by `RFC 9530 Appendix C.2 <https://www.rfc-editor.org/rfc/rfc9530#appendix-C.2>`_).
62-
For example, a client sending ``Want-Repr-Digest: sha-256=1`` will receive a header such as ``Repr-Digest: sha-256=:AEGPTgUMw5e96wxZuDtpfm23RBU3nFwtgY5fw4NYORo=:`` in the response.
63-
This allows clients to verify the integrity of downloaded files and avoid re-downloading files they already hold by comparing digests.
64-
65-
.. note:: The ``Want-Repr-Digest`` / ``Repr-Digest`` support also applies to the snapshot download endpoints (``/node/snapshot/{snapshot_name}``).
66-
67-
2. :http:GET:`/node/ledger-chunk` and :http:HEAD:`/node/ledger-chunk`, both taking a `seqno` query parameter.
53+
1. :http:GET:`/node/ledger-chunk` and :http:HEAD:`/node/ledger-chunk`, both taking a `since` query parameter.
6854

6955
These endpoints can be used by a client to download the next ledger chunk including a given sequence number `<seqno>`.
70-
The redirects to the appropriate chunk if it exists, using the previous set of endpoints, or returns a `404 Not Found` response if no such chunk is available.
56+
They redirect to the appropriate chunk if it exists, using the endpoints described below, or return a `404 Not Found` response if no such chunk is available.
7157

72-
In the usual case, a downloading client will first hit a Backup, and will eventually want to download files recent enough that only the primary can provide them:
58+
In the typical case, a requesting client will first hit a Backup, and will eventually work its way to chunks recent enough that only the primary can provide them:
7359

7460
.. mermaid::
7561

@@ -120,6 +106,56 @@ then the following sequence can occur:
120106
Backup->>-Client: 308 Location: https://primary/node/ledger-chunk?since=51
121107
Client->>+Primary: GET /node/ledger-chunk?since=101
122108

109+
2. :http:GET:`/node/ledger-chunk/{chunk_name}` and :http:HEAD:`/node/ledger-chunk/{chunk_name}`
110+
111+
These endpoints allow downloading a specific ledger chunk by name, where `{chunk_name}` is of the form `ledger_<start_seqno>-<end_seqno>.committed`.
112+
They support the HTTP `Range` header for partial downloads, and the `HEAD` method for clients to query metadata such as the total size without downloading the full chunk.
113+
They also populate the `x-ms-ccf-ledger-chunk-name` response header with the name of the chunk being served.
114+
115+
Want-Repr-Digest and Repr-Digest
116+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
117+
118+
These endpoints also support the ``Want-Repr-Digest`` request header (`RFC 9530 <https://www.rfc-editor.org/rfc/rfc9530>`_).
119+
When set, the response will include a ``Repr-Digest`` header containing the digest of the full representation of the file.
120+
Supported algorithms are ``sha-256``, ``sha-384``, and ``sha-512``. If the header contains only unsupported or invalid algorithms, the server defaults to ``sha-256`` (as permitted by `RFC 9530 Appendix C.2 <https://www.rfc-editor.org/rfc/rfc9530#appendix-C.2>`_).
121+
For example, a client sending ``Want-Repr-Digest: sha-256=1`` will receive a header such as ``Repr-Digest: sha-256=:AEGPTgUMw5e96wxZuDtpfm23RBU3nFwtgY5fw4NYORo=:`` in the response.
122+
This allows clients to verify the integrity of downloaded files and avoid re-downloading files they already hold by comparing digests.
123+
124+
.. note:: The ``Want-Repr-Digest`` / ``Repr-Digest`` support also applies to the snapshot download endpoints (:http:GET:`/node/snapshot/{snapshot_name}` and :http:HEAD:`/node/snapshot/{snapshot_name}`).
125+
126+
ETag and If-None-Match
127+
^^^^^^^^^^^^^^^^^^^^^^
128+
129+
``GET /node/ledger-chunk/{chunk_name}`` supports ``ETag`` and ``If-None-Match`` headers, allowing clients to atomically check whether a chunk (or a range of a chunk) has changed and re-download it in a single request, without needing a separate metadata query first.
130+
Every successful ``GET`` response includes an ``ETag`` header whose value uses the `RFC 9530 <https://www.rfc-editor.org/rfc/rfc9530>`_ digest format: ``"sha-256=:<base64_digest>:"``, where ``<base64_digest>`` is the base64-encoded SHA-256 digest of the returned content (which may be a sub-range when the ``Range`` header is used).
131+
132+
.. note:: ETag values must be surrounded by double quotes, as per `RFC 7232 <https://www.rfc-editor.org/rfc/rfc7232#section-2.3>`_.
133+
134+
Clients can send an ``If-None-Match`` request header containing one or more ETags. If the content matches any of the provided ETags, the server responds with ``304 Not Modified`` instead of re-sending the body. The supported digest algorithms are ``sha-256``, ``sha-384``, and ``sha-512``.
135+
A client that already holds a chunk can check whether it has changed by sending the previously received ETag in ``If-None-Match``. If the content is unchanged, the server responds with ``304 Not Modified`` and no body, saving bandwidth:
136+
137+
.. note:: The node currently defaults to ``sha-256`` for ETags, but clients can also send other supported digest algorithms in the ``If-None-Match`` header, and the server will use the first supported one it finds. For example, if the client sends ``If-None-Match: "sha-384=:AAAA...=:", "sha-256=:47DEQpj8HBSa+/TImW...=:"``, the server will use the ``sha-384`` digest if it supports it, and fall back to ``sha-256`` otherwise.
138+
139+
.. mermaid::
140+
141+
sequenceDiagram
142+
Note over Client: Client already has chunk with known ETag
143+
Client->>+Node: GET /node/ledger-chunk/ledger_1-100.committed<br/>If-None-Match: "sha-256=:47DEQpj8HBSa+/TImW...=:"
144+
Note over Node: Computes digest, matches If-None-Match
145+
Node->>-Client: 304 Not Modified<br/>ETag: "sha-256=:47DEQpj8HBSa+/TImW...=:"
146+
Note over Client: No body transferred, client keeps existing copy
147+
148+
If the ``If-None-Match`` ETag does not match the current content (e.g. the client has an outdated copy, or is checking a different chunk), the server returns the full content as a fresh download:
149+
150+
.. mermaid::
151+
152+
sequenceDiagram
153+
Note over Client: Client sends an ETag that does not match
154+
Client->>+Node: GET /node/ledger-chunk/ledger_1-100.committed<br/>If-None-Match: "sha-256=:AAAA...=:"
155+
Note over Node: Computes digest, does not match If-None-Match
156+
Node->>-Client: 200 OK<br/>ETag: "sha-256=:47DEQpj8HBSa+/TImW...=:"<br/><Chunk Contents>
157+
Note over Client: Client stores chunk and new ETag for future requests
158+
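The algorithm choice described in the documentation note above can be sketched as follows. This is only an illustration under stated assumptions: `pick_etag_algorithm` is a hypothetical helper that is not part of CCF, it takes ETag values that have already been unquoted, and it reads "the first supported one it finds" as header order, which the diff shown here does not confirm.

```cpp
#include <array>
#include <string>
#include <string_view>
#include <vector>

// Algorithms listed as supported in the documentation above.
constexpr std::array<std::string_view, 3> supported_algorithms = {
  "sha-256", "sha-384", "sha-512"};

// Hypothetical helper (not a CCF function): given unquoted ETag values taken
// from If-None-Match, return the algorithm named by the first entry that uses
// a supported digest, or "sha-256" when no entry does.
std::string_view pick_etag_algorithm(const std::vector<std::string>& etags)
{
  for (const auto& etag : etags)
  {
    // Digest ETags look like "<algorithm>=:<base64>:".
    const auto sep = etag.find("=:");
    if (sep == std::string::npos)
    {
      continue;
    }
    const std::string_view name(etag.data(), sep);
    for (const auto algo : supported_algorithms)
    {
      if (name == algo)
      {
        return algo;
      }
    }
  }
  // Default documented for ETag responses.
  return supported_algorithms[0];
}
```

With the header from the note, `pick_etag_algorithm({"sha-384=:AAAA...=:", "sha-256=:47DEQpj8HBSa+/TImW...=:"})` would select `sha-384`; the server would then compute that digest over the returned content and compare it with the supplied value.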
123159
Snapshots
124160
---------
125161

include/ccf/http_consts.h

Lines changed: 2 additions & 0 deletions
@@ -21,6 +21,8 @@ namespace ccf
2121
static constexpr auto HOST = "host";
2222
static constexpr auto LOCATION = "location";
2323
static constexpr auto RANGE = "range";
24+
static constexpr auto ETAG = "etag";
25+
static constexpr auto IF_NONE_MATCH = "if-none-match";
2426
static constexpr auto REPR_DIGEST = "repr-digest";
2527
static constexpr auto RETRY_AFTER = "retry-after";
2628
static constexpr auto TRAILER = "trailer";

include/ccf/http_etag.h

Lines changed: 19 additions & 3 deletions
@@ -3,12 +3,28 @@
33
#pragma once
44

55
#define FMT_HEADER_ONLY
6+
#include <exception>
67
#include <regex>
78
#include <set>
89
#include <string>
910

1011
namespace ccf::http
1112
{
13+
/** Exception thrown when the Matcher encounters an invalid ETag header. */
14+
class MatcherError : public std::exception
15+
{
16+
private:
17+
std::string msg;
18+
19+
public:
20+
MatcherError(std::string msg_) : msg(std::move(msg_)) {}
21+
22+
[[nodiscard]] const char* what() const noexcept override
23+
{
24+
return msg.c_str();
25+
}
26+
};
27+
1228
/** Utility class to resolve If-Match and If-None-Match as described
1329
* in https://www.rfc-editor.org/rfc/rfc9110#field.if-match
1430
*/
@@ -33,7 +49,7 @@ namespace ccf::http
3349
return;
3450
}
3551

36-
std::regex etag_rx(R"(\"([0-9a-f]+)\",?\s*)");
52+
std::regex etag_rx(R"(\"([^\"]+)\",?\s*)");
3753
auto etags_begin =
3854
std::sregex_iterator(match_header.begin(), match_header.end(), etag_rx);
3955
auto etags_end = std::sregex_iterator();
@@ -43,7 +59,7 @@ namespace ccf::http
4359
{
4460
if (i->position() != last_matched)
4561
{
46-
throw std::runtime_error("Invalid If-Match header");
62+
throw MatcherError("Invalid If-Match header");
4763
}
4864
const std::smatch& match = *i;
4965
if_etags.insert(match[1].str());
@@ -54,7 +70,7 @@ namespace ccf::http
5470

5571
if (last_matched != last_index_in_header || if_etags.empty())
5672
{
57-
throw std::runtime_error("Invalid If-Match header");
73+
throw MatcherError("Invalid If-Match header");
5874
}
5975
}
6076
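To see why the ETag pattern had to be widened, the two regular expressions from this hunk can be compared side by side. The sketch below is a standalone illustration: `extract_etags` is a hypothetical helper, and unlike `ccf::http::Matcher` it only collects matches and does not reject headers with unmatched text between entries.

```cpp
#include <regex>
#include <set>
#include <string>

// Hypothetical helper (not part of CCF): collects every quoted ETag that the
// given pattern finds in a header value.
std::set<std::string> extract_etags(
  const std::string& header, const std::regex& etag_rx)
{
  std::set<std::string> etags;
  for (auto it = std::sregex_iterator(header.begin(), header.end(), etag_rx);
       it != std::sregex_iterator();
       ++it)
  {
    // Capture group 1 is the ETag body without the surrounding quotes.
    etags.insert((*it)[1].str());
  }
  return etags;
}

// Pattern before this commit: ETag bodies limited to lowercase hex digits.
const std::regex old_etag_rx(R"(\"([0-9a-f]+)\",?\s*)");
// Pattern after this commit: any non-empty run of non-quote characters.
const std::regex new_etag_rx(R"(\"([^\"]+)\",?\s*)");
```

For a digest-style header such as `"sha-256=:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=:"`, the old pattern finds no ETag at all because of the `-`, `=`, `:`, `+`, `/` and uppercase characters, while the new pattern captures the whole value. Plain hex ETags such as `"abc", "def"` are captured by both.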

samples/apps/logging/logging.cpp

Lines changed: 86 additions & 26 deletions
@@ -843,25 +843,45 @@ namespace loggingapp
843843
// side-effect.
844844
if (match_headers.if_match.has_value())
845845
{
846-
ccf::http::Matcher matcher(match_headers.if_match.value());
847-
if (!matcher.matches(etag))
846+
try
847+
{
848+
ccf::http::Matcher matcher(match_headers.if_match.value());
849+
if (!matcher.matches(etag))
850+
{
851+
return ccf::make_error(
852+
HTTP_STATUS_PRECONDITION_FAILED,
853+
ccf::errors::PreconditionFailed,
854+
"Resource has changed.");
855+
}
856+
}
857+
catch (const ccf::http::MatcherError& e)
848858
{
849859
return ccf::make_error(
850-
HTTP_STATUS_PRECONDITION_FAILED,
851-
ccf::errors::PreconditionFailed,
852-
"Resource has changed.");
860+
HTTP_STATUS_BAD_REQUEST,
861+
ccf::errors::InvalidHeaderValue,
862+
e.what());
853863
}
854864
}
855865

856866
if (match_headers.if_none_match.has_value())
857867
{
858-
ccf::http::Matcher matcher(match_headers.if_none_match.value());
859-
if (matcher.matches(etag))
868+
try
869+
{
870+
ccf::http::Matcher matcher(match_headers.if_none_match.value());
871+
if (matcher.matches(etag))
872+
{
873+
return ccf::make_error(
874+
HTTP_STATUS_PRECONDITION_FAILED,
875+
ccf::errors::PreconditionFailed,
876+
"Resource has changed.");
877+
}
878+
}
879+
catch (const ccf::http::MatcherError& e)
860880
{
861881
return ccf::make_error(
862-
HTTP_STATUS_PRECONDITION_FAILED,
863-
ccf::errors::PreconditionFailed,
864-
"Resource has changed.");
882+
HTTP_STATUS_BAD_REQUEST,
883+
ccf::errors::InvalidHeaderValue,
884+
e.what());
865885
}
866886
}
867887
}
@@ -935,23 +955,43 @@ namespace loggingapp
935955

936956
if (match_headers.if_match.has_value())
937957
{
938-
ccf::http::Matcher matcher(match_headers.if_match.value());
939-
if (!matcher.matches(etag))
958+
try
959+
{
960+
ccf::http::Matcher matcher(match_headers.if_match.value());
961+
if (!matcher.matches(etag))
962+
{
963+
return ccf::make_error(
964+
HTTP_STATUS_PRECONDITION_FAILED,
965+
ccf::errors::PreconditionFailed,
966+
"Resource has changed.");
967+
}
968+
}
969+
catch (const ccf::http::MatcherError& e)
940970
{
941971
return ccf::make_error(
942-
HTTP_STATUS_PRECONDITION_FAILED,
943-
ccf::errors::PreconditionFailed,
944-
"Resource has changed.");
972+
HTTP_STATUS_BAD_REQUEST,
973+
ccf::errors::InvalidHeaderValue,
974+
e.what());
945975
}
946976
}
947977

948978
// On a GET, If-None-Match passing returns 304 Not Modified
949979
if (match_headers.if_none_match.has_value())
950980
{
951-
ccf::http::Matcher matcher(match_headers.if_none_match.value());
952-
if (matcher.matches(etag))
981+
try
982+
{
983+
ccf::http::Matcher matcher(match_headers.if_none_match.value());
984+
if (matcher.matches(etag))
985+
{
986+
return ccf::make_redirect(HTTP_STATUS_NOT_MODIFIED);
987+
}
988+
}
989+
catch (const ccf::http::MatcherError& e)
953990
{
954-
return ccf::make_redirect(HTTP_STATUS_NOT_MODIFIED);
991+
return ccf::make_error(
992+
HTTP_STATUS_BAD_REQUEST,
993+
ccf::errors::InvalidHeaderValue,
994+
e.what());
955995
}
956996
}
957997

@@ -1030,22 +1070,42 @@ namespace loggingapp
10301070

10311071
if (match_headers.if_match.has_value())
10321072
{
1033-
ccf::http::Matcher matcher(match_headers.if_match.value());
1034-
if (!matcher.matches(etag))
1073+
try
1074+
{
1075+
ccf::http::Matcher matcher(match_headers.if_match.value());
1076+
if (!matcher.matches(etag))
1077+
{
1078+
return ccf::make_error(
1079+
HTTP_STATUS_PRECONDITION_FAILED,
1080+
ccf::errors::PreconditionFailed,
1081+
"Resource has changed.");
1082+
}
1083+
}
1084+
catch (const ccf::http::MatcherError& e)
10351085
{
10361086
return ccf::make_error(
1037-
HTTP_STATUS_PRECONDITION_FAILED,
1038-
ccf::errors::PreconditionFailed,
1039-
"Resource has changed.");
1087+
HTTP_STATUS_BAD_REQUEST,
1088+
ccf::errors::InvalidHeaderValue,
1089+
e.what());
10401090
}
10411091
}
10421092

10431093
if (match_headers.if_none_match.has_value())
10441094
{
1045-
ccf::http::Matcher matcher(match_headers.if_none_match.value());
1046-
if (matcher.matches(etag))
1095+
try
10471096
{
1048-
return ccf::make_redirect(HTTP_STATUS_NOT_MODIFIED);
1097+
ccf::http::Matcher matcher(match_headers.if_none_match.value());
1098+
if (matcher.matches(etag))
1099+
{
1100+
return ccf::make_redirect(HTTP_STATUS_NOT_MODIFIED);
1101+
}
1102+
}
1103+
catch (const ccf::http::MatcherError& e)
1104+
{
1105+
return ccf::make_error(
1106+
HTTP_STATUS_BAD_REQUEST,
1107+
ccf::errors::InvalidHeaderValue,
1108+
e.what());
10491109
}
10501110
}
10511111
}
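The same try/catch shape is repeated in each handler above. As a rough sketch of the mapping it implements for `If-None-Match` on a `GET`, the logic can be condensed into one function. Everything here is an assumption made so the example compiles on its own: `MatcherError` and `Matcher` are simplified stand-ins for the CCF classes (this `Matcher` accepts only `*` or a single quoted ETag), and `if_none_match_status_for_get` is a hypothetical helper that returns plain integer status codes rather than CCF response objects.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Stand-in for ccf::http::MatcherError so that the sketch is self-contained.
struct MatcherError : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Greatly simplified stand-in for ccf::http::Matcher: accepts "*" or exactly
// one quoted, non-empty ETag, and throws MatcherError for anything else.
struct Matcher
{
  bool any = false;
  std::string etag;

  explicit Matcher(const std::string& header)
  {
    if (header == "*")
    {
      any = true;
      return;
    }
    if (header.size() < 3 || header.front() != '"' || header.back() != '"')
    {
      throw MatcherError("Invalid If-Match header");
    }
    etag = header.substr(1, header.size() - 2);
  }

  bool matches(const std::string& value) const
  {
    return any || etag == value;
  }
};

// Hypothetical helper: HTTP status a GET handler would choose for a given
// If-None-Match header and the ETag of the current content.
int if_none_match_status_for_get(
  const std::optional<std::string>& if_none_match,
  const std::string& current_etag)
{
  if (!if_none_match.has_value())
  {
    return 200; // No precondition, serve the content.
  }
  try
  {
    const Matcher matcher(if_none_match.value());
    return matcher.matches(current_etag) ? 304 : 200;
  }
  catch (const MatcherError&)
  {
    return 400; // Malformed header is a client error.
  }
}
```

Catching the dedicated `MatcherError` type rather than a generic `std::runtime_error` keeps the 400 response limited to header-parsing failures, so unrelated runtime errors raised elsewhere in a handler are not misreported as bad requests.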

src/http/test/http_etag_test.cpp

Lines changed: 41 additions & 9 deletions
@@ -39,37 +39,69 @@ TEST_CASE("If-Match: \"abc\", \"def\"")
3939
TEST_CASE("If-Match invalid inputs")
4040
{
4141
REQUIRE_THROWS_AS_MESSAGE(
42-
ccf::http::Matcher im(""), std::runtime_error, "Invalid If-Match header");
42+
ccf::http::Matcher im(""),
43+
ccf::http::MatcherError,
44+
"Invalid If-Match header");
4345
REQUIRE_THROWS_AS_MESSAGE(
4446
ccf::http::Matcher im("not etags"),
45-
std::runtime_error,
47+
ccf::http::MatcherError,
4648
"Invalid If-Match header");
4749
REQUIRE_THROWS_AS_MESSAGE(
4850
ccf::http::Matcher im("\"abc\", not etags"),
49-
std::runtime_error,
51+
ccf::http::MatcherError,
5052
"Invalid If-Match header");
5153
REQUIRE_THROWS_AS_MESSAGE(
5254
ccf::http::Matcher im("not etags, \"abc\""),
53-
std::runtime_error,
55+
ccf::http::MatcherError,
5456
"Invalid If-Match header");
5557
REQUIRE_THROWS_AS_MESSAGE(
5658
ccf::http::Matcher im("W/\"abc\""),
57-
std::runtime_error,
59+
ccf::http::MatcherError,
5860
"Invalid If-Match header");
5961
REQUIRE_THROWS_AS_MESSAGE(
6062
ccf::http::Matcher im("W/\"abc\", \"def\""),
61-
std::runtime_error,
63+
ccf::http::MatcherError,
6264
"Invalid If-Match header");
6365
REQUIRE_THROWS_AS_MESSAGE(
6466
ccf::http::Matcher im("\"abc\", \"def"),
65-
std::runtime_error,
67+
ccf::http::MatcherError,
6668
"Invalid If-Match header");
6769
REQUIRE_THROWS_AS_MESSAGE(
6870
ccf::http::Matcher im("\"abc\",, \"def\""),
69-
std::runtime_error,
71+
ccf::http::MatcherError,
7072
"Invalid If-Match header");
7173
REQUIRE_THROWS_AS_MESSAGE(
7274
ccf::http::Matcher im(",\"abc\""),
73-
std::runtime_error,
75+
ccf::http::MatcherError,
7476
"Invalid If-Match header");
77+
}
78+
79+
TEST_CASE("If-None-Match with RFC 9530 digest ETag format")
80+
{
81+
// Single sha-256 ETag in RFC 9530 structured field format
82+
{
83+
ccf::http::Matcher im(
84+
"\"sha-256=:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=:\"");
85+
REQUIRE(!im.is_any());
86+
REQUIRE(
87+
im.matches("sha-256=:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=:"));
88+
REQUIRE(!im.matches("sha-256=:AAAA:"));
89+
}
90+
91+
// Multiple algorithm ETags
92+
{
93+
ccf::http::Matcher im(
94+
"\"sha-256=:abc=:\", \"sha-384=:def=:\", \"sha-512=:ghi=:\"");
95+
REQUIRE(im.matches("sha-256=:abc=:"));
96+
REQUIRE(im.matches("sha-384=:def=:"));
97+
REQUIRE(im.matches("sha-512=:ghi=:"));
98+
REQUIRE(!im.matches("sha-256=:000=:"));
99+
}
100+
101+
// Wildcard still works
102+
{
103+
ccf::http::Matcher im("*");
104+
REQUIRE(im.is_any());
105+
REQUIRE(im.matches("sha-256=:anything:"));
106+
}
75107
}
