Feasibility of Multi-tenant JDT LS & Memory Optimization Strategies (Bachelor Thesis @TUM) #3612
-
Yes, it is.
Yes, although it's less important (if the workspace and JDT model were multi-tenant, multiple users could share the same OSGi instance and access only the information that's visible to them).
I think it's theoretically already possible to provision a project and/or a workspace with pre-built indexes for jars, so there is no need to read the jars (the pre-built index is pushed directly to the local index). You may want to read https://www.eclipse.org/lists/jdt-dev/msg01688.html and look at ClasspathEntry.getIndexLibraryLocation() to get a sense of how to use it.
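To make the pre-built index idea a bit more concrete, here is a minimal sketch of what a library entry in a project's `.classpath` file could look like when it points at an existing index file. It assumes the extra classpath attribute is named `index_location` and takes a URL as its value, which is my understanding of the JDT attribute behind the method mentioned above; the jar path and index path are hypothetical, and the helper class itself is only for illustration.

```java
import java.nio.file.Path;

class IndexLocationEntry {
    // Assumed name of the JDT classpath attribute that points at a pre-built index.
    static final String INDEX_LOCATION = "index_location";

    /**
     * Renders a .classpath library entry whose index attribute is a file URL.
     * Paths are assumed to contain no XML special characters (no escaping is done).
     */
    static String libraryEntry(String jarPath, Path indexFile) {
        // The attribute value is a URL, so convert the filesystem path first.
        String indexUrl = indexFile.toAbsolutePath().toUri().toString();
        return "<classpathentry kind=\"lib\" path=\"" + jarPath + "\">\n"
             + "  <attributes>\n"
             + "    <attribute name=\"" + INDEX_LOCATION + "\" value=\"" + indexUrl + "\"/>\n"
             + "  </attributes>\n"
             + "</classpathentry>";
    }

    public static void main(String[] args) {
        // Hypothetical locations: a starter-project jar and an index on a shared, read-only volume.
        System.out.println(libraryEntry(
                "lib/starter-dependency.jar",
                Path.of("/opt/shared-indexes/starter-dependency.index")));
    }
}
```

In a setup like the one described in the question, the index files could sit on a read-only volume mounted into every container, so each JDT LS instance would reuse them rather than re-reading the jars. Whether JDT LS honors the attribute for projects imported from Maven or Gradle would need to be checked.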
-
Not as far as I know. IIRC the index is loaded fully into memory and persisted to the filesystem for reuse across restarts. I don't think any further optimizations are made, but that's worth investigating and would be a good topic for a bachelor thesis IMO. As an alternative to an explicit shared index, if everyone starts from the same project, you could also build your JDT-LS container so that the project is already loaded in it and processed at least once: when creating the container, start jdt(-ls?) with this project and run some basic operations ahead of actual usage. This should allow the Eclipse IDE to bootstrap its many models (the workspace, the index, the build model, the Java model...) and ensure they are stored on the filesystem to speed up the next startup.
-
Since you're using Theia, you can enable AppCDS in vscode-java with
-
Hi everyone,
I am currently working on my Bachelor’s Thesis at the Technical University of Munich. My research focuses on optimizing resource efficiency for cloud-based development environments (Eclipse Theia) in large-scale educational settings.
The Context
We deploy Theia on Kubernetes for programming courses where hundreds of students often work on the exact same "starter project" simultaneously. Currently, we provision a dedicated container with a separate JDT LS process for every user. Scaling this to thousands of concurrent students results in massive memory consumption due to the overhead of thousands of JVMs running in parallel.
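One way to put a rough number on this overhead is to read the committed heap and non-heap memory of a single JVM and project it linearly over the number of instances. The sketch below uses the standard `java.lang.management` API and measures the JVM it runs in, not a JDT LS process; the instance count is a hypothetical value, and the linear projection is an assumption that ignores any pages shared between processes as well as memory the JVM uses outside these pools (thread stacks, for example).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class JvmFootprint {
    /** Converts bytes to whole mebibytes (rounded down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Linear projection: total MiB if `instances` JVMs each commit `perJvmMiB`. */
    static long projectedMiB(long perJvmMiB, int instances) {
        return Math.multiplyExact(perJvmMiB, (long) instances);
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        // Committed memory is what the JVM has reserved from the OS for these pools.
        long heapMiB = toMiB(heap.getCommitted());
        long nonHeapMiB = toMiB(nonHeap.getCommitted());
        long perJvmMiB = heapMiB + nonHeapMiB;

        int instances = 1000; // hypothetical number of concurrent students
        System.out.println("heap committed (MiB):     " + heapMiB);
        System.out.println("non-heap committed (MiB): " + nonHeapMiB);
        System.out.println("projected total (MiB):    " + projectedMiB(perJvmMiB, instances));
    }
}
```

For a running JDT LS the same figures would have to be collected from that process itself, for instance over JMX, and compared against the container's resident set size.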
My Architectural Goal
My initial thesis proposal involved designing a "Shared JVM" architecture, where a single centralized JVM hosts multiple, logically isolated JDT LS sessions for different users to reduce the JVM memory overhead.
Question 1: Feasibility of a Shared JVM
Based on my analysis of the Eclipse/OSGi architecture, I suspect this approach is technically impossible because:
Could you briefly confirm if this assessment is correct? Is strictly one JVM process required per JDT LS session?
Question 2: Shared Indexes for Identical Projects
Since many of our users work on identical starter projects with the same dependencies (JDK + Libraries), every single JDT LS instance currently computes and stores the exact same index data.
Question 3: Application Class Data Sharing (AppCDS)
To mitigate the overhead of running many separate JVMs, I am considering using Java AppCDS.
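As a small aid for experiments with AppCDS, the following sketch checks whether the JVM it runs in was started with the HotSpot option `-XX:SharedArchiveFile=<path>`, which names the class data archive to use. It only inspects the launch arguments through the standard `RuntimeMXBean`; it does not prove that the archive was actually mapped. The class name and the archive path in the usage note are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

class CdsFlagCheck {
    private static final String ARCHIVE_FLAG = "-XX:SharedArchiveFile=";

    /** Returns the archive path passed via -XX:SharedArchiveFile=..., if any. */
    static Optional<String> sharedArchive(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith(ARCHIVE_FLAG))
                .map(arg -> arg.substring(ARCHIVE_FLAG.length()))
                .findFirst();
    }

    public static void main(String[] args) {
        // Input arguments are the JVM options, not the program arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(sharedArchive(jvmArgs)
                .map(path -> "shared archive configured: " + path)
                .orElse("no -XX:SharedArchiveFile option found"));
    }
}
```

Running it as `java -XX:SharedArchiveFile=/tmp/app.jsa CdsFlagCheck.java` should report the configured path. On JDK 13 and later, an archive can be produced from a normal run with `-XX:ArchiveClassesAtExit=<path>`.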
General Recommendations
Beyond the specific points above, I would greatly appreciate any other suggestions or "low-hanging fruits" regarding configuration flags, JVM tuning, or architectural approaches to further reduce the memory footprint and file size of the language server in a containerized environment.
Any insights, architectural constraints, or pointers to relevant documentation would be incredibly helpful for my research.
Best regards,
Nikolas Hack
Technical University of Munich