-
Notifications
You must be signed in to change notification settings - Fork 209
[feat]: added one-logger support for NeMo RL #1631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Naveenraj Kamalakannan <[email protected]>
📝 WalkthroughWalkthroughThis pull request integrates NV-One-Logger as a new logging backend into the NeMo RL framework. It adds the nv-one-logger submodule, creates wrapper packages with setup and configuration files, introduces the OneLogger class implementing the LoggerInterface, and updates project dependencies and example configurations to support the new logger backend. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nemo_rl/utils/logger.py (1)
948-978: Require explicit cfg["one_logger"] configuration when one_logger_enabled is True, consistent with other backends.The one_logger initialization uses
cfg.get("one_logger", {})which allows silent fallback to an empty config dict, whereas wandb, swanlab, tensorboard, and mlflow all require their config keys to be present (using directcfg[key]access). This inconsistency can result in OneLogger being silently initialized with defaults instead of failing fast when configuration is missing.Change from:
if cfg["one_logger_enabled"]: # OneLogger needs no log dir setup self.one_logger = OneLogger(cfg.get("one_logger", {}), log_dir=self.base_log_dir) self.loggers.append(self.one_logger)To:
if cfg["one_logger_enabled"]: one_logger_cfg = cfg["one_logger"] # Require key to be present # OneLogger needs no log dir setup self.one_logger = OneLogger(one_logger_cfg, log_dir=self.base_log_dir) self.loggers.append(self.one_logger)This matches the pattern used by all other logging backends and ensures configuration is explicitly provided.
🧹 Nitpick comments (7)
pyproject.toml (1)
191-191: Minor clarity improvement for the comment.The comment starting with "If we don't do this" could be slightly clearer about the consequence.
Consider this diff for improved clarity:
- # If we don't do this, the research projects do not see the global uv.lock, and may mistakenly + # If we don't do this, research projects will not see the global uv.lock and may mistakenly3rdparty/NV-One-Logger-workspace/pyproject.toml (1)
5-9: Consider aligning Python version requirement with project standards.The
requires-python = ">=3.8"may conflict with the project's Python 3.12+ guideline. If this wrapper is intended for use only within NeMo-RL (which targets 3.12+), consider updating torequires-python = ">=3.12". However, if maintaining compatibility with the upstream nv-one-logger package is intentional, this can remain as-is.3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.py (2)
26-30: Poetry caret conversion loses upper-bound semantics.Poetry's
^specifier (e.g.,^1.2.3) means>=1.2.3, <2.0.0for major versions. Converting to just>=1.2.3removes the upper bound, potentially allowing breaking changes. If strict version compatibility is needed, consider using thepoetry-coreorpoetry-semverlibraries for accurate conversion, or explicitly set upper bounds.
17-59: Consider extracting shared logic to reduce duplication.This dependency extraction and setup logic is duplicated across all four
setup.pyfiles (core, otel, wandb, training-telemetry). Consider creating a shared utility module (e.g.,_setup_utils.py) that handles Poetry-to-setuptools conversion, then have eachsetup.pyimport and call it with package-specific parameters.Example shared utility approach:
# 3rdparty/NV-One-Logger-workspace/_setup_utils.py def setup_nv_logger_package(src_dir, name, description): """Shared setup logic for nv-one-logger wrappers.""" # ... common logic here ...Then each setup.py becomes:
from _setup_utils import setup_nv_logger_package setup_nv_logger_package( src_dir=Path(__file__).parent.parent.parent / "nv-one-logger" / "nv_one_logger" / "one_logger_otel", name="nv-one-logger-otel", description="Extensions to onelogger library to use Open telemetry (OTEL) as a backend.", )3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.py (1)
6-59: Duplicated setup logic.This file duplicates the same Poetry-to-setuptools conversion logic found in the other wrapper setup.py files. Please consider the shared utility refactor suggested in the otel wrapper review to improve maintainability.
3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py (1)
6-59: Duplicated setup logic.Same duplication concern as other wrappers. Please consider the shared utility refactor suggested in the otel wrapper review.
3rdparty/NV-One-Logger-workspace/setup.py (1)
6-59: Duplicated setup logic - consider this file as the host for shared utility.This root setup.py shares identical logic with all wrapper setup.py files. Consider placing a shared
_setup_utils.pymodule here in3rdparty/NV-One-Logger-workspace/that all setup.py files can import from.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.gitmodules(1 hunks)3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.py(1 hunks)3rdparty/NV-One-Logger-workspace/nv-one-logger(1 hunks)3rdparty/NV-One-Logger-workspace/pyproject.toml(1 hunks)3rdparty/NV-One-Logger-workspace/setup.py(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/pyproject.toml(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.py(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/pyproject.toml(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/pyproject.toml(1 hunks)3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.py(1 hunks)examples/configs/grpo_math_1B.yaml(1 hunks)examples/configs/sft.yaml(1 hunks)nemo_rl/utils/logger.py(7 hunks)pyproject.toml(4 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
!(**/tests/**|**/test_*.py|**/test_*.sh)
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year
Files:
3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/pyproject.toml.gitmodulesexamples/configs/grpo_math_1B.yaml3rdparty/NV-One-Logger-workspace/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.pypyproject.toml3rdparty/NV-One-Logger-workspace/nv-one-logger3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/pyproject.toml3rdparty/NV-One-Logger-workspace/pyproject.toml3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/pyproject.toml3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.pyexamples/configs/sft.yaml3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.pynemo_rl/utils/logger.py
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code
Files:
3rdparty/NV-One-Logger-workspace/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.pynemo_rl/utils/logger.py
**/*.{py,sh}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)
Files:
3rdparty/NV-One-Logger-workspace/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.py3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.pynemo_rl/utils/logger.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes
Files:
nemo_rl/utils/logger.py
🪛 Ruff (0.14.8)
3rdparty/NV-One-Logger-workspace/setup.py
10-10: Avoid specifying long messages outside the exception class
(TRY003)
3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-wandb/setup.py
10-10: Avoid specifying long messages outside the exception class
(TRY003)
3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-training-telemetry/setup.py
10-10: Avoid specifying long messages outside the exception class
(TRY003)
3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.py
15-15: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.py
10-10: Avoid specifying long messages outside the exception class
(TRY003)
nemo_rl/utils/logger.py
467-467: Unused method argument: log_dir
(ARG002)
512-512: Unused method argument: step_metric
(ARG002)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Post submodule check comment / Comment on PR
- GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (5)
examples/configs/sft.yaml (1)
209-212: LGTM! Consistent logger configuration pattern.The new
one_loggerconfiguration follows the established pattern used by other loggers (wandb, swanlab) with appropriate variable interpolation and safe default (one_logger_enabled: false).examples/configs/grpo_math_1B.yaml (1)
286-289: LGTM! Consistent with sft.yaml configuration.The
one_loggerconfiguration is consistent with the pattern established insft.yamland other logger configurations, ensuring uniformity across example configs.3rdparty/NV-One-Logger-workspace/is_nvlogger_installed.py (1)
1-21: LGTM! Copyright header and structure are correct.The file follows coding guidelines correctly:
- NVIDIA copyright header with current year (2025)
- Upper snake case for module-level constant
INSTALLED- Clean try/except pattern for import detection
The Ruff RUF100 hint about the unused
noqadirective is a false positive—thenoqa: F401is necessary since F401 (unused import) is enabled in the project's ruff configuration (line 316 of rootpyproject.toml), and the import is intentionally unused as it's only checked for availability.pyproject.toml (2)
158-161: LGTM! Proper workspace configuration.The workspace sources, members, and no-build-isolation-package entries are correctly configured and follow the same pattern as other workspace packages (e.g., megatron-core, megatron-bridge).
Also applies to: 185-189, 218-221
52-55: The nv-one-logger packages should remain as required dependencies independencies, not optional.The packages are unconditionally imported at the top level of
nemo_rl/utils/logger.py(lines 57–63), which itself is unconditionally imported across all main example scripts and algorithm modules (sft.py, distillation.py, dpo.py, grpo.py, rm.py, etc.). If nv-one-logger were optional, importing any of these modules would fail if the packages were not installed. Theone_logger_enabledconfiguration flag only controls whether the logger is used, not whether it can be imported.
| [submodule "3rdparty/NV-One-Logger-workspace/nv-one-logger"] | ||
| path = 3rdparty/NV-One-Logger-workspace/nv-one-logger | ||
| url = [email protected]:NVIDIA/nv-one-logger.git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
cat -n .gitmodulesRepository: NVIDIA-NeMo/RL
Length of output: 1102
🏁 Script executed:
curl -s -o /dev/null -w "%{http_code}" https://github.com/NVIDIA/nv-one-loggerRepository: NVIDIA-NeMo/RL
Length of output: 61
Add branch and shallow clone configuration for consistency.
The new submodule entry lacks branch and shallow properties that are present in all four existing submodules. Additionally, it uses an SSH URL instead of HTTPS like all other NVIDIA submodules. This creates inconsistency and may lead to unintended behavior:
- Without
shallow = true, the submodule will perform full clones instead of shallow clones, increasing download size and time. - Without a
branchspecification, the submodule will track HEAD rather than a stable, pinned branch. - The SSH URL (
[email protected]:) differs from the HTTPS URLs used for all other NVIDIA submodules in this file.
Apply this diff to align with existing submodule configurations:
[submodule "3rdparty/NV-One-Logger-workspace/nv-one-logger"]
path = 3rdparty/NV-One-Logger-workspace/nv-one-logger
- url = [email protected]:NVIDIA/nv-one-logger.git
+ url = https://github.com/NVIDIA/nv-one-logger.git
+ branch = main
+ shallow = true📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| [submodule "3rdparty/NV-One-Logger-workspace/nv-one-logger"] | |
| path = 3rdparty/NV-One-Logger-workspace/nv-one-logger | |
| url = [email protected]:NVIDIA/nv-one-logger.git | |
| [submodule "3rdparty/NV-One-Logger-workspace/nv-one-logger"] | |
| path = 3rdparty/NV-One-Logger-workspace/nv-one-logger | |
| url = https://github.com/NVIDIA/nv-one-logger.git | |
| branch = main | |
| shallow = true |
🤖 Prompt for AI Agents
In .gitmodules around lines 21-23, the NV-One-Logger submodule entry is missing
the branch and shallow settings and uses an SSH URL; update the entry to match
the other NVIDIA submodules by changing the URL to the HTTPS form, adding a
branch line matching the other NVIDIA submodules (e.g., branch = main or the
same branch used by the other entries), and add shallow = true so the submodule
is shallow-cloned.
| @@ -0,0 +1 @@ | |||
| Subproject commit c382dd5193ae987233c7a61de33fb16f91724b7d | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
echo "=== Checking repository structure ==="
ls -la 3rdparty/ 2>/dev/null | head -20 || echo "3rdparty directory listing failed"
echo -e "\n=== Checking if submodule file exists ==="
if [ -f "3rdparty/NV-One-Logger-workspace/nv-one-logger" ]; then
echo "File found. Content:"
cat "3rdparty/NV-One-Logger-workspace/nv-one-logger"
else
echo "File not found at specified path"
fi
echo -e "\n=== Checking .gitmodules configuration ==="
if [ -f .gitmodules ]; then
echo ".gitmodules content:"
cat .gitmodules
else
echo ".gitmodules file not found"
fi
echo -e "\n=== Searching for OneLogger references in codebase ==="
rg -i "onelogger|one.?logger" --type py --type sh 2>/dev/null | head -30 || echo "No OneLogger references found"
echo -e "\n=== Searching for test files ==="
find . -type f \( -name "*test*.py" -o -name "*test*.sh" \) 2>/dev/null | grep -i "logger\|log" || echo "No logger-related test files found"
echo -e "\n=== Searching for documentation files mentioning OneLogger ==="
rg -i "onelogger|one.?logger" --type md --type txt 2>/dev/null | head -30 || echo "No documentation found mentioning OneLogger"
echo -e "\n=== Listing Python files (excluding tests) to check for imports ==="
fd -e py . --exclude tests --exclude test_\* 2>/dev/null | xargs grep -l "onelogger\|nv_one_logger" 2>/dev/null || echo "No Python files import OneLogger"Repository: NVIDIA-NeMo/RL
Length of output: 4838
🏁 Script executed:
head -100 tests/unit/utils/test_logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 3519
🏁 Script executed:
rg -n "OneLogger" tests/unit/utils/test_logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 40
Add OneLogger integration tests and documentation.
OneLogger is integrated in nemo_rl/utils/logger.py but lacks test coverage and documentation. Before merging:
- Add unit tests for the
OneLoggerclass intests/unit/utils/test_logger.py, including configuration, initialization, and logging functionality. - Add documentation describing OneLogger setup, configuration options, and usage examples (e.g., README or integration guide).
- Verify the submodule commit hash in
.gitmodulesis accessible and stable in the NVIDIA/nv-one-logger repository.
🤖 Prompt for AI Agents
In 3rdparty/NV-One-Logger-workspace/nv-one-logger around lines 1 to 1, tests and
documentation for OneLogger are missing; add unit tests at
tests/unit/utils/test_logger.py that instantiate OneLogger with various
valid/invalid configurations, assert initialization state, and verify that log
methods emit expected messages (use a temporary config and capture logs or mock
handlers), create documentation (README or integration guide) detailing setup
steps, available configuration options and usage examples showing initialization
and common logging calls, and finally confirm the .gitmodules submodule commit
hash points to an accessible, stable commit in the NVIDIA/nv-one-logger repo and
update it if necessary.
| import sys | ||
| import toml | ||
| from pathlib import Path | ||
| import setuptools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Missing NVIDIA copyright header and unused import.
Same issues as the wrapper setup.py files: add the NVIDIA copyright header and remove the unused sys import. As per coding guidelines for Python files.
Apply this diff:
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
-import sys
import toml
from pathlib import Path
import setuptools🤖 Prompt for AI Agents
In 3rdparty/NV-One-Logger-workspace/setup.py around lines 1-4, the file is
missing the required NVIDIA copyright header and contains an unused import of
sys; remove the unused "import sys" line and prepend the standard NVIDIA
copyright header comment block at the top of the file (matching other wrapper
setup.py files in the repo) so the file begins with the header followed by the
remaining imports (toml, Path, setuptools).
| [project] | ||
| name = "nv-one-logger-otel" | ||
| description = "Wrapper for nv-one-logger-otel" | ||
| requires-python = ">=3.8" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent Python version requirement.
The requires-python = ">=3.8" setting is inconsistent with the root pyproject.toml which specifies requires-python = ">=3.12" (line 16) and the coding guidelines which require Python 3.12+ conformance. This could allow installation in unsupported Python environments.
Apply this diff to align with project standards:
-requires-python = ">=3.8"
+requires-python = ">=3.12"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| requires-python = ">=3.8" | |
| requires-python = ">=3.12" |
🤖 Prompt for AI Agents
In 3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/pyproject.toml
around line 8, update the Python requirement from ">=3.8" to ">=3.12" to match
the root pyproject.toml and project guidelines; edit the requires-python field
to ">=3.12" so the package cannot be installed on older, unsupported Python
versions and ensure any related CI/metadata remains consistent.
| import sys | ||
| import toml | ||
| from pathlib import Path | ||
| import setuptools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Missing NVIDIA copyright header and unused import.
Per coding guidelines, Python files require the NVIDIA copyright header. Also, sys is imported but never used.
Apply this diff:
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
-import sys
import toml
from pathlib import Path
import setuptools📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import sys | |
| import toml | |
| from pathlib import Path | |
| import setuptools | |
| # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
| # SPDX-License-Identifier: Apache-2.0 | |
| # | |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |
| # you may not use this file except in compliance with the License. | |
| # You may obtain a copy of the License at | |
| # | |
| # http://www.apache.org/licenses/LICENSE-2.0 | |
| # | |
| # Unless required by applicable law or agreed to in writing, software | |
| # distributed under the License is distributed on an "AS IS" BASIS, | |
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |
| # See the License for the specific language governing permissions and | |
| # limitations under the License. | |
| import toml | |
| from pathlib import Path | |
| import setuptools |
🤖 Prompt for AI Agents
In 3rdparty/NV-One-Logger-workspace/wrappers/nv-one-logger-otel/setup.py lines
1-4, the file is missing the required NVIDIA copyright header and imports an
unused module (`sys`); add the standard NVIDIA Python file header comment block
at the very top of the file and remove the unused `import sys` line so the file
only imports required modules (toml, Path, setuptools). Ensure the header is
placed before any imports and keep the remaining imports unchanged.
| # Add 3rdparty to path to support vendored nv-one-logger | ||
| import sys | ||
| _3rd_party_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../3rdparty/nv-one-logger/nv_one_logger")) | ||
| if os.path.isdir(_3rd_party_root): | ||
| # We need to add the src directories of the individual packages to sys.path | ||
| _core_src = os.path.join(_3rd_party_root, "one_logger_core/src") | ||
| _otel_src = os.path.join(_3rd_party_root, "one_logger_otel/src") | ||
|
|
||
| if os.path.isdir(_core_src) and _core_src not in sys.path: | ||
| sys.path.append(_core_src) | ||
| if os.path.isdir(_otel_src) and _otel_src not in sys.path: | ||
| sys.path.append(_otel_src) | ||
|
|
||
| from nv_one_logger.api.config import OneLoggerConfig as BaseOneLoggerConfig | ||
| from nv_one_logger.api.one_logger_provider import OneLoggerProvider | ||
| from nv_one_logger.core.attributes import Attributes | ||
| from nv_one_logger.core.event import Event | ||
| from nv_one_logger.core.span import StandardSpanName | ||
| from nv_one_logger.otel.exporter.otel_exporter import OTelExporter | ||
| from nv_one_logger.recorder.default_recorder import DefaultRecorder | ||
| from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter | ||
|
|
||
| from nemo_rl.data.interfaces import LLMMessageLogType | ||
| from nemo_rl.distributed.batched_data_dict import BatchedDataDict | ||
| HAVE_ONE_LOGGER = True | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, locate the file and check its size
find . -name "logger.py" -path "*/nemo_rl/utils/*" -type fRepository: NVIDIA-NeMo/RL
Length of output: 83
🏁 Script executed:
# Get the file size to determine how to read it
wc -l nemo_rl/utils/logger.py 2>/dev/null || echo "File not found, trying alternate path"Repository: NVIDIA-NeMo/RL
Length of output: 86
🏁 Script executed:
# Read the relevant section and context around lines 44-69
sed -n '1,100p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 3945
🏁 Script executed:
# Search for HAVE_ONE_LOGGER usage throughout the file
rg "HAVE_ONE_LOGGER" nemo_rl/utils/logger.py -nRepository: NVIDIA-NeMo/RL
Length of output: 233
🏁 Script executed:
# Check the OneLogger class initialization and how it's used
rg -A 20 "class OneLogger" nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 1404
🏁 Script executed:
# Look for any import error handling or try-except blocks around imports
sed -n '40,100p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 2512
🏁 Script executed:
# Check Logger.__init__ to see how OneLogger backend is instantiated
sed -n '900,1000p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 5019
🏁 Script executed:
# Look for where LoggerConfig one_logger field is accessed
rg -n "one_logger" nemo_rl/utils/logger.py | head -30Repository: NVIDIA-NeMo/RL
Length of output: 1620
🏁 Script executed:
# Check if there's any env var or fallback for one_logger config
rg -B5 -A5 "OneLogger(" nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 128
🏁 Script executed:
# Check OneLogger.__init__ signature and usage of log_dir parameter
sed -n '464,520p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 2558
🏁 Script executed:
# Verify OneLoggerConfigDict is correctly defined and its usage
sed -n '95,125p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 1073
Wrap nv-one-logger imports in try-except to prevent import-time crash when package is unavailable.
Currently, the unconditional imports at lines 57–64 will raise ImportError before line 68 can execute, making HAVE_ONE_LOGGER ineffective and preventing nemo_rl/utils/logger.py from importing at all if nv-one-logger is neither vendored nor installed. The check in OneLogger.__init__ (line 468) cannot protect against this module-level failure.
# Add 3rdparty to path to support vendored nv-one-logger
import sys
_3rd_party_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../3rdparty/nv-one-logger/nv_one_logger"))
if os.path.isdir(_3rd_party_root):
# We need to add the src directories of the individual packages to sys.path
_core_src = os.path.join(_3rd_party_root, "one_logger_core/src")
_otel_src = os.path.join(_3rd_party_root, "one_logger_otel/src")
- if os.path.isdir(_core_src) and _core_src not in sys.path:
- sys.path.append(_core_src)
- if os.path.isdir(_otel_src) and _otel_src not in sys.path:
- sys.path.append(_otel_src)
+ if os.path.isdir(_core_src) and _core_src not in sys.path:
+ sys.path.insert(0, _core_src)
+ if os.path.isdir(_otel_src) and _otel_src not in sys.path:
+ sys.path.insert(0, _otel_src)
-from nv_one_logger.api.config import OneLoggerConfig as BaseOneLoggerConfig
-from nv_one_logger.api.one_logger_provider import OneLoggerProvider
-from nv_one_logger.core.attributes import Attributes
-from nv_one_logger.core.event import Event
-from nv_one_logger.core.span import StandardSpanName
-from nv_one_logger.otel.exporter.otel_exporter import OTelExporter
-from nv_one_logger.recorder.default_recorder import DefaultRecorder
-from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
+try:
+ from nv_one_logger.api.config import OneLoggerConfig as BaseOneLoggerConfig
+ from nv_one_logger.api.one_logger_provider import OneLoggerProvider
+ from nv_one_logger.core.attributes import Attributes
+ from nv_one_logger.core.event import Event
+ from nv_one_logger.core.span import StandardSpanName
+ from nv_one_logger.otel.exporter.otel_exporter import OTelExporter
+ from nv_one_logger.recorder.default_recorder import DefaultRecorder
+ from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
+except ImportError:
+ HAVE_ONE_LOGGER = False
+else:
+ HAVE_ONE_LOGGER = True
from nemo_rl.data.interfaces import LLMMessageLogType
from nemo_rl.distributed.batched_data_dict import BatchedDataDict
-HAVE_ONE_LOGGER = True🤖 Prompt for AI Agents
In nemo_rl/utils/logger.py around lines 44 to 69, the unconditional imports from
nv_one_logger will raise ImportError at module import time and make
HAVE_ONE_LOGGER ineffective; wrap the nv_one_logger imports in a try/except
ImportError block so that if the package is unavailable you catch the
ImportError, set HAVE_ONE_LOGGER = False (and optionally record a debug/warning
via the standard logging module), and only set HAVE_ONE_LOGGER = True inside the
try block after successful imports; keep the existing sys.path adjustments
outside the try so vendored paths are still attempted before importing.
| class OneLoggerConfigDict(TypedDict): | ||
| project: str | ||
| name: str | ||
| enable_for_current_rank: NotRequired[bool] | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align TypedDict requirements with runtime behavior (avoid hidden defaults; enforce config presence when enabled).
OneLoggerConfigDict requires project/name, but Logger passes {} and OneLogger supplies fallback defaults in code, which conflicts with the “YAML is source of truth / no hidden defaults” guideline.
class OneLoggerConfigDict(TypedDict):
project: str
name: str
enable_for_current_rank: NotRequired[bool]
class LoggerConfig(TypedDict):
@@
one_logger_enabled: bool
@@
one_logger: NotRequired[OneLoggerConfigDict]And then enforce presence when enabled (see Logger.init comment below) and access cfg["project"] / cfg["name"] (see OneLogger.init comment below).
Also applies to: 106-118
🤖 Prompt for AI Agents
In nemo_rl/utils/logger.py around lines 95-99 (and similarly 106-118), the
TypedDict currently requires project/name but runtime passes {} and OneLogger
supplies fallbacks; update the TypedDict so project and name are NotRequired
(make them optional), then add a validation in Logger.__init__ that if logging
is enabled it raises a descriptive error when cfg lacks "project" or "name", and
finally update OneLogger.__init__ to assume presence and access cfg["project"]
and cfg["name"] (instead of using .get) after the Logger check ensures they
exist.
| class OneLogger(LoggerInterface): | ||
| """OneLogger backend.""" | ||
|
|
||
| def __init__(self, cfg: OneLoggerConfigDict, log_dir: Optional[str] = None): | ||
| if not HAVE_ONE_LOGGER: | ||
| print("Warning: nv_one_logger not installed. OneLogger will be disabled.") | ||
| return | ||
|
|
||
| self.provider = OneLoggerProvider.instance() | ||
| project = cfg.get("project", "nemo_rl") | ||
| name = cfg.get("name", "nemo_rl_run") | ||
|
|
||
| if not self.provider.one_logger_ready: | ||
| # Configure OneLogger | ||
| one_logger_config = BaseOneLoggerConfig( | ||
| application_name=project, | ||
| session_tag_or_fn=name, | ||
| world_size_or_fn=lambda: int(os.environ.get("WORLD_SIZE", 1)), | ||
| enable_for_current_rank=cfg.get("enable_for_current_rank", True), | ||
| ) | ||
|
|
||
| # Setup OTLP exporter | ||
| otel_endpoint = os.environ.get( | ||
| "OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318/v1/traces" | ||
| ) | ||
| os.environ["OTEL_SERVICE_NAME"] = project | ||
| otel_exporter = OTelExporter( | ||
| otel_span_exporter=OTLPSpanExporter(endpoint=otel_endpoint) | ||
| ) | ||
|
|
||
| recorder = DefaultRecorder(exporters=[otel_exporter]) | ||
| self.provider.configure(one_logger_config, recorder) | ||
| print(f"Initialized OneLogger for project {project}, run {name}") | ||
|
|
||
| self.recorder = self.provider.recorder | ||
|
|
||
| # Ensure we have an active application span | ||
| app_spans = self.recorder.get_active_spans_by_name(StandardSpanName.APPLICATION) | ||
| if app_spans: | ||
| self.app_span = app_spans[0] | ||
| else: | ||
| self.app_span = self.recorder.start(StandardSpanName.APPLICATION) | ||
|
|
||
| def log_metrics( | ||
| self, | ||
| metrics: dict[str, Any], | ||
| step: int, | ||
| prefix: Optional[str] = "", | ||
| step_metric: Optional[str] = None, | ||
| ) -> None: | ||
| if not HAVE_ONE_LOGGER or not self.app_span: | ||
| return | ||
|
|
||
| attributes_dict = {"step": step} | ||
|
|
||
| for name, value in metrics.items(): | ||
| if prefix: | ||
| name = f"{prefix}/{name}" | ||
| # OneLogger attributes support basic types. | ||
| if isinstance(value, (int, float, str, bool)): | ||
| attributes_dict[name] = value | ||
| elif isinstance(value, torch.Tensor) and value.numel() == 1: | ||
| attributes_dict[name] = value.item() | ||
|
|
||
| # Record as an event | ||
| self.recorder.event( | ||
| span=self.app_span, | ||
| event=Event.create( | ||
| name="metrics_update", | ||
| attributes=Attributes(attributes_dict), | ||
| ), | ||
| ) | ||
|
|
||
| def log_hyperparams(self, params: Mapping[str, Any]) -> None: | ||
| if not HAVE_ONE_LOGGER or not self.app_span: | ||
| return | ||
|
|
||
| flat_params = flatten_dict(params) | ||
| # Convert to supported types | ||
| valid_params = {} | ||
| for k, v in flat_params.items(): | ||
| if isinstance(v, (int, float, str, bool)): | ||
| valid_params[k] = v | ||
| else: | ||
| valid_params[k] = str(v) | ||
|
|
||
| self.app_span.attributes.add_attributes(Attributes(valid_params)) | ||
|
|
||
| def log_histogram(self, histogram: list[Any], step: int, name: str) -> None: | ||
| pass | ||
|
|
||
| def log_plot(self, figure: plt.Figure, step: int, name: str) -> None: | ||
| pass | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the file structure and see if it exists
git ls-files nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 81
🏁 Script executed:
# Read the specific file to verify current state
wc -l nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 86
🏁 Script executed:
# Read the OneLogger class implementation
sed -n '464,557p' nemo_rl/utils/logger.py | cat -nRepository: NVIDIA-NeMo/RL
Length of output: 4085
🏁 Script executed:
# Check for config definitions - likely a TypedDict
rg "OneLoggerConfigDict" -A 5 -B 2Repository: NVIDIA-NeMo/RL
Length of output: 1301
🏁 Script executed:
# Check if there's a copyright header at the top of the file
head -20 nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 728
🏁 Script executed:
# Check for @ray.remote decorators in the file
rg "@ray.remote" nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 40
🏁 Script executed:
# Check for OpenTelemetry dependency version in pyproject.toml or requirements
fd -e toml -e txt | xargs grep -l "opentelemetry\|otel" 2>/dev/null | head -5Repository: NVIDIA-NeMo/RL
Length of output: 148
🏁 Script executed:
# Look for config schema or YAML files that define defaults
fd -e yaml | xargs grep -l "onelogger\|one_logger" -i 2>/dev/nullRepository: NVIDIA-NeMo/RL
Length of output: 118
🏁 Script executed:
# Check YAML examples for one_logger configuration
cat examples/configs/grpo_math_1B.yaml | grep -A 10 "one_logger"Repository: NVIDIA-NeMo/RL
Length of output: 436
🏁 Script executed:
# Check sft.yaml
cat examples/configs/sft.yaml | grep -A 10 "one_logger"Repository: NVIDIA-NeMo/RL
Length of output: 508
🏁 Script executed:
# Check pyproject.toml for OpenTelemetry dependency version
grep -A 2 -B 2 "opentelemetry\|otel" pyproject.tomlRepository: NVIDIA-NeMo/RL
Length of output: 837
🏁 Script executed:
# Check if log_dir is used anywhere in the class or as base class method
rg "log_dir" nemo_rl/utils/logger.pyRepository: NVIDIA-NeMo/RL
Length of output: 2942
🏁 Script executed:
# Verify step_metric usage in log_metrics
rg "step_metric" nemo_rl/utils/logger.py -A 3 -B 3Repository: NVIDIA-NeMo/RL
Length of output: 6620
Initialize all class members in __init__ path; don't skip initialization on early return.
On early return (line 5-7), self.app_span and self.recorder are never initialized. Later calls (lines 51, 66, 75, 87) then fail with AttributeError. Always initialize these members before any early return.
Also: For required config keys project and name, use direct access (cfg["project"]) instead of cfg.get() with defaults—the TypedDict marks them required. The unused step_metric parameter should be explicitly ignored with _ = step_metric.
🧰 Tools
🪛 Ruff (0.14.8)
467-467: Unused method argument: log_dir
(ARG002)
512-512: Unused method argument: step_metric
(ARG002)
| Parameters: | ||
| cfg (LoggerConfig): Configuration mapping. Expected keys include: | ||
| - "log_dir": base directory for backend logs. | ||
| - "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled": booleans to enable backends. | ||
| - "wandb", "swanlab", "tensorboard", "mlflow": per-backend configuration dicts. | ||
| - "log_dir": base directory for backend logs. | ||
| - "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled", "one_logger_enabled": booleans to enable backends. | ||
| - "wandb", "swanlab", "tensorboard", "mlflow", "one_logger": per-backend configuration dicts. | ||
| - "monitor_gpus": boolean to enable Ray GPU monitoring. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring has duplicated bullets (“log_dir” repeated).
Clean up to avoid confusing rendered docs.
cfg (LoggerConfig): Configuration mapping. Expected keys include:
- "log_dir": base directory for backend logs.
- - "log_dir": base directory for backend logs.
- "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled", "one_logger_enabled": booleans to enable backends.
- "wandb", "swanlab", "tensorboard", "mlflow", "one_logger": per-backend configuration dicts.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Parameters: | |
| cfg (LoggerConfig): Configuration mapping. Expected keys include: | |
| - "log_dir": base directory for backend logs. | |
| - "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled": booleans to enable backends. | |
| - "wandb", "swanlab", "tensorboard", "mlflow": per-backend configuration dicts. | |
| - "log_dir": base directory for backend logs. | |
| - "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled", "one_logger_enabled": booleans to enable backends. | |
| - "wandb", "swanlab", "tensorboard", "mlflow", "one_logger": per-backend configuration dicts. | |
| - "monitor_gpus": boolean to enable Ray GPU monitoring. | |
| Parameters: | |
| cfg (LoggerConfig): Configuration mapping. Expected keys include: | |
| - "log_dir": base directory for backend logs. | |
| - "wandb_enabled", "swanlab_enabled", "tensorboard_enabled", "mlflow_enabled", "one_logger_enabled": booleans to enable backends. | |
| - "wandb", "swanlab", "tensorboard", "mlflow", "one_logger": per-backend configuration dicts. | |
| - "monitor_gpus": boolean to enable Ray GPU monitoring. |
🤖 Prompt for AI Agents
In nemo_rl/utils/logger.py around lines 951 to 957, the docstring parameter list
contains a duplicated bullet for "log_dir"; remove the duplicate entry (or
replace it with the correct intended key if a different config key was meant) so
the docstring lists each expected configuration key only once and update any
wording to keep the list consistent and unambiguous.
|
@terrykong I've integrated nv-one-logger:
let me know what you think, thanks! |
terrykong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @therealnaveenkamal for the help adding one-logger. please see my review below and let me know if you have any questions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so for all the one logger dependencies, can we switch to VCS installs in pyproject.toml? see https://github.com/NVIDIA-NeMo/RL/blob/main/pyproject.toml#L154 for example
for context, the 3rdparty repo contains dependencies we expect to change a lot and require us to edit the code to have a monorepo experience. considering logging is probably not something we will change very often, i think it will be easier to maintain this as a regular python package.
| # Add 3rdparty to path to support vendored nv-one-logger | ||
| import sys | ||
| _3rd_party_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "../../3rdparty/nv-one-logger/nv_one_logger")) | ||
| if os.path.isdir(_3rd_party_root): | ||
| # We need to add the src directories of the individual packages to sys.path | ||
| _core_src = os.path.join(_3rd_party_root, "one_logger_core/src") | ||
| _otel_src = os.path.join(_3rd_party_root, "one_logger_otel/src") | ||
|
|
||
| if os.path.isdir(_core_src) and _core_src not in sys.path: | ||
| sys.path.append(_core_src) | ||
| if os.path.isdir(_otel_src) and _otel_src not in sys.path: | ||
| sys.path.append(_otel_src) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming this is moved to pyproject.toml as a regular VCS install, i think we can just assume one logger is installed to simplify this
| """OneLogger backend.""" | ||
|
|
||
| def __init__(self, cfg: OneLoggerConfigDict, log_dir: Optional[str] = None): | ||
| if not HAVE_ONE_LOGGER: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can remove
| project = cfg.get("project", "nemo_rl") | ||
| name = cfg.get("name", "nemo_rl_run") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these default should come from yaml
| project = cfg.get("project", "nemo_rl") | |
| name = cfg.get("name", "nemo_rl_run") | |
| project = cfg["project"] | |
| name = cfg["name"] |
| one_logger_config = BaseOneLoggerConfig( | ||
| application_name=project, | ||
| session_tag_or_fn=name, | ||
| world_size_or_fn=lambda: int(os.environ.get("WORLD_SIZE", 1)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WORLD_SIZE doesn't have a value in this context b/c this is executed on the driver and there may be multiple "worlds" b/c multiple policy worker groups and the generation worker groups. Probably best thing to do here is num_nodes * gpus_per_node
| url = https://github.com/NVIDIA-NeMo/Gym.git | ||
| branch = main | ||
| shallow = true | ||
| [submodule "3rdparty/NV-One-Logger-workspace/nv-one-logger"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be needed
| nv-one-logger-core = { workspace = true } | ||
| nv-one-logger-otel = { workspace = true } | ||
| nv-one-logger-training-telemetry = { workspace = true } | ||
| nv-one-logger-wandb = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for these vcs installs can probably use subdir syntax https://docs.astral.sh/uv/concepts/projects/dependencies/#git
|
|
||
| # Setup OTLP exporter | ||
| otel_endpoint = os.environ.get( | ||
| "OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318/v1/traces" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's this for?
| return | ||
|
|
||
|
|
||
| class OneLogger(LoggerInterface): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please add unit tests for this?
What does this PR do ?
Adds OneLogger (https://github.com/NVIDIA/nv-one-logger) support for NeMo-RL
Issues
Addresses #1152
Usage
Before your PR is "Ready for review"
Pre checks:
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.