Conversation
steveloughran
left a comment
Looks like a good design to me.
Test-wise, some JSON strings should be enough (hey, you can use Java 17 """ strings on the trunk PR!), with scans for
- valid `@class` ref
- illegal ref in map a couple of levels down
- illegal ref through an array
- null handling.
This wouldn't stop anyone bypassing the client-side checks and submitting directly, but I verified how parseJson() is reached there: it is invoked after the caller permissions check. So not only does the plan get validated there too, you have to be a sysadmin before that validation is even initiated.
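Such fixture strings might look like the following sketch. The class name `evil.Gadget` and the exact JSON shapes are hypothetical stand-ins for illustration, not taken from the patch:

```java
public class NodePlanJsonFixtures {
  // Legitimate plan: @class points into the allowed Hadoop package.
  static final String VALID = """
      {"@class": "org.apache.hadoop.hdfs.server.diskbalancer.planner.MoveStep"}
      """;

  // Illegal ref buried in a map a couple of levels down.
  static final String NESTED_MAP = """
      {"outer": {"inner": {"@class": "evil.Gadget"}}}
      """;

  // Illegal ref reached only through an array element.
  static final String IN_ARRAY = """
      {"steps": [{"@class": "evil.Gadget"}]}
      """;

  // Null value for @class, to exercise null handling.
  static final String NULL_CLASS = """
      {"@class": null}
      """;
}
```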
return READER.readValue(tree);
}

// throws an IOException if any unexpected `@class` values are found
- make this a javadoc comment
- highlight recursion through arrays and nested maps
// throws an IOException if any unexpected `@class` values are found
// this exception is checked for by the calling code
private static void checkNodes(JsonNode node) throws IOException {
  if (node == null) return;
Add { } around the return, even though it's minimal on its own.
I have the format changes ready locally
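To make the recursion through arrays and nested maps concrete: the same traversal can be sketched stdlib-only, with plain `Map`/`List` values standing in for Jackson's `ObjectNode`/`ArrayNode`. The allowed package prefix and the error message here are illustrative, not the patch's actual code:

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;

public class ClassRefScanner {
  /**
   * Recursively scans a JSON-like tree for {@code @class} entries and
   * rejects any value outside the allowed package prefix.
   * Recurses through both nested maps and array (list) elements.
   *
   * @throws IOException if an unexpected {@code @class} value is found
   */
  static void checkNodes(Object node) throws IOException {
    if (node == null) {
      return;
    }
    if (node instanceof Map<?, ?> map) {
      for (Map.Entry<?, ?> entry : map.entrySet()) {
        if ("@class".equals(entry.getKey())
            && entry.getValue() instanceof String s
            && !s.startsWith("org.apache.hadoop.hdfs.server")) {
          throw new IOException("Invalid @class value: " + s);
        }
        checkNodes(entry.getValue());   // descend into nested maps
      }
    } else if (node instanceof List<?> list) {
      for (Object element : list) {
        checkNodes(element);            // descend into array elements
      }
    }
  }
}
```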
@steveloughran I've looked at the code again and I can't find any other classes that implement Step other than MoveStep. A radical solution might be to change NodePlan in 3.5.0 to support only MoveStep. My worry about keeping the support for different Step implementations is that if it is valid to have other implementations, where are they? Is it feasible that third parties create their own Step implementations and are somehow able to serialize and persist them? If so, what class names would we expect and need to allow?
I couldn't see any other uses either. If you think that's good, then it's a lot simpler. Presumably someone did the current design on the off chance a future change would add different steps. We should also be able to modify the import rule checks to block the relevant Jackson tag from ever appearing in our code again, at least in direct code.
💔 -1 overall
This message was automatically generated.
steveloughran
left a comment
Commented. Though your other idea of just restricting to MoveStep is a lot simpler...
sampleStep.setMaxDiskErrors(4567);
nodePlan.addStep(sampleStep);
String json = nodePlan.toJson();
IOException ex = assertThrows(IOException.class, () -> NodePlan.parseJson(json));
I prefer using our intercept:

intercept(IOException.class,
    "Invalid @class value in NodePlan JSON: sample.SampleStep",
    () -> NodePlan.parseJson(json));

One key difference is that if the exception isn't raised, the assertion failure includes the string value of the lambda expression's output (here, whatever NodePlan is created), which sets you up for actually debugging failures.
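The behaviour described can be sketched as a small stdlib-only helper. This is a hypothetical simplification (Hadoop's real helper, `LambdaTestUtils.intercept`, has many more overloads); the point is that on a missing exception the failure message carries the evaluated result:

```java
import java.util.concurrent.Callable;

public class Intercept {
  /**
   * Runs the callable and verifies it throws the expected exception type
   * whose message contains the given text. If no exception is raised,
   * the failure message includes the callable's result, which is what
   * makes debugging unexpected successes easy.
   */
  static <T> Exception intercept(Class<? extends Exception> clazz,
      String contained, Callable<T> eval) throws Exception {
    T result;
    try {
      result = eval.call();
    } catch (Exception e) {
      if (!clazz.isInstance(e)) {
        throw e;                       // wrong type: rethrow for diagnosis
      }
      if (!String.valueOf(e.getMessage()).contains(contained)) {
        throw new AssertionError("Expected text not found in: " + e, e);
      }
      return e;                        // expected failure: return for inspection
    }
    // No exception was raised: include the evaluated result in the failure.
    throw new AssertionError(
        "Expected " + clazz.getName() + " but got result: " + result);
  }
}
```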
import java.io.IOException;

import static org.junit.jupiter.api.Assertions.assertEquals;
Could you use AssertJ assertions? That makes it fairly trivial to backport to branch-3.4 just by changing the @Test import. Otherwise, we'll need to write the AssertJ asserts there anyway...
if ("@class".equals(entry.getKey())) {
  String textValue = entry.getValue().asText();
  if (textValue != null && !textValue.isBlank() &&
      !textValue.startsWith("org.apache.hadoop.hdfs.server")) {
Don't want to complicate things, but such a restriction should:
- at least be documented in interface Step's javadocs, or
- (not preferred, might be overkill for this case) be made configurable
Making it configurable is extra complexity and risk. Remove it, and if someone can see a need, then it can be considered.
Jackson will try to deserialize to the named class. There is a disallow list for well-known attack classes, but it can never be complete. It was a really bad decision to use class names here; Jackson supports other approaches, but changing now is a breaking change.
@pjfanning how does Jackson deserialization work here? Is there a link to a doc? I'm wondering whether a class is checked for being an implementation of Step before the instance is instantiated. If it isn't, then we'd still be at risk of something in the package having adverse side effects in its construction, which of course is what happened to Parquet's first deserialization fix last year.
Force-pushed from 551c938 to 786cad1.
steveloughran
left a comment
Commented.
I do worry that supporting more than MoveStep is a bit of over-engineering, but it does mean that if someone has done custom operations, they will still work.
</property>
<property>
  <name>dfs.nodeplan.steps.supported.packages</name>
  <value>org.apache.hadoop.hdfs.server</value>
Should we restrict to org.apache.hadoop.hdfs.server.diskbalancer.planner for even more lockdown?
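For reference, the tightened value would read (a sketch reusing the property name from the diff above):

```xml
<property>
  <name>dfs.nodeplan.steps.supported.packages</name>
  <value>org.apache.hadoop.hdfs.server.diskbalancer.planner</value>
</property>
```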
steveloughran
left a comment
+1 pending changes
private static Collection<String> getAllowedPackages() {
  return CONFIGURATION.getStringCollection(SUPPORTED_PACKAGES_CONFIG_NAME)
      .stream()
      .map(String::trim)
Needless, they get trimmed already. Do leave that filter in though, as I'm not sure what happens there.
StringUtils.getStringCollection uses the legacy and discouraged java.util.StringTokenizer. StringTokenizer does not trim values; it only consumes the delimiter, so stray whitespace after a comma, for instance, is returned as part of the token:
public static void main(String[] args) {
  String text = "a, b";
  java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(text, ",");
  while (tokenizer.hasMoreTokens()) {
    System.out.println("<" + tokenizer.nextToken() + ">");  // prints <a> then < b>
  }
}
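A stdlib-only sketch of why the `map(String::trim)` matters downstream. The helper names here are hypothetical stand-ins mirroring the StringTokenizer behaviour of Hadoop's `StringUtils.getStringCollection`, not the real utilities:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.StringTokenizer;
import java.util.stream.Collectors;

public class TrimDemo {
  // Stand-in for StringUtils.getStringCollection: splits on commas
  // without trimming, just as StringTokenizer does.
  static Collection<String> getStringCollection(String s) {
    List<String> out = new ArrayList<>();
    StringTokenizer tokenizer = new StringTokenizer(s, ",");
    while (tokenizer.hasMoreTokens()) {
      out.add(tokenizer.nextToken());
    }
    return out;
  }

  static List<String> allowedPackages(String configured) {
    return getStringCollection(configured).stream()
        .map(String::trim)            // strips the stray space after each comma
        .filter(p -> !p.isEmpty())    // drops whitespace-only entries
        .collect(Collectors.toList());
  }
}
```

Without the `trim`, a configured value like `"a, b"` would yield `" b"` as an allowed prefix, and `startsWith` checks against it would silently fail.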
Description of PR
Do sanity check on NodePlan JSON
How was this patch tested?
For code changes:
Does the patch modify LICENSE, LICENSE-binary, or NOTICE-binary files?
AI Tooling
If an AI tool was used, state the name of the AI tool used.
https://www.apache.org/legal/generative-tooling.html