Commit 9357ac6

udpate path images
1 parent c0e8555 commit 9357ac6

2 files changed

+10
-10
lines changed

content/posts/MaNo.md

Lines changed: 9 additions & 9 deletions
@@ -123,7 +123,7 @@ Now you understand that logits are very important for generalisation performance
 Mathematically, softmax is defined as:
 </p>
 <figure id="my-fig" class="numbered" style="display: inline-block; vertical-align: middle; margin-left: 10px;">
-<img src="/content/images/Mano/softmax_img.png" class="align-center" style="width: 250px; height: auto;">
+<img src="/images/Mano/softmax_img.png" class="align-center" style="width: 250px; height: auto;">
 </figure>

 However, this approach has a major issue: it is **sensitive to prediction bias** and can lead to **overconfidence**. In other words, if a model generates very high logits for a class (indicating strong confidence in its prediction), but that prediction is incorrect, it can skew the results. This phenomenon is largely due to the **exponential function** in the softmax formula, which amplifies the differences between logits. This can lead to significant errors, especially when the model is overly confident without being accurate.
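The amplification described in the context line above can be checked numerically. The snippet below is a minimal sketch and is not part of the commit; the logit values are hypothetical and chosen only for illustration. It applies softmax to two logit vectors with the same ranking but different gaps.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical logits for a 3-class problem (illustrative values only).
moderate = np.array([2.0, 1.0, 0.0])
large = 5.0 * moderate  # same ranking, five times larger gaps

p_moderate = softmax(moderate)
p_large = softmax(large)

# The top-class probability grows much faster than the logit gap itself.
print("moderate logits ->", p_moderate.round(3))
print("large logits    ->", p_large.round(3))
```

With the larger gaps, nearly all probability mass goes to the top class, whether or not that class is correct.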
@@ -135,7 +135,7 @@ To address this challenge, the paper introduces **MANO**, a novel method that le
 ## **Introducing MANO: A Two-Step Approach** {#section-1}
 MANO addresses these challenges through a two-step process: **Normalization with Softrun** and **Aggregation using Matrix Norms**. Here is a scheme so you can visualize the process :

-![Mano schema](/content/images/Mano/Mano_schema.png)
+![Mano schema](/images/Mano/Mano_schema.png)

 ### **1. Normalization with Softrun** {#section-1.1}
 As explained before, Softmax is a very common activation function to transform logits into probabilities. But its exponential nature exaggerates differences between logits, making the model appear more confident than it actually is.
@@ -154,14 +154,14 @@ $$ \sigma(q_i) = \frac{v(q_i)}{\sum_{k=1}^{K} v(q_i)_k} \in \Delta_K$$
 Thanks to $\Phi(\mathcal{D}_{test})$, it will determine whether to apply a Taylor or softmax normalization term. The function $v(q)$ is defined as:

 <figure id="my-fig_eq_v" class="numbered" >
-<img src="/content//images/Mano/equation_v.png" class="align-center">
+<img src="/images/Mano/equation_v.png" class="align-center">
 <p style="text-align: center;"></p>
 </figure>

 When the model’s predictions are unreliable, Softrun applies a Taylor approximation rather than the softmax. The Taylor approximation smooths out the effect of large logits, preventing the model from being overly confident in any particular prediction. By contrast, when the dataset is well-calibrated, the function behaves like softmax, preserving probability distributions where confidence is warranted.

 <figure id="my-fig" class="numbered" style="float: left; margin-left: 10px; width: 50%;">
-<img src="/content/images/Mano/Lp_norm_schema.png" class="align-center" style="width: 100%; height: auto;">
+<img src="/images/Mano/Lp_norm_schema.png" class="align-center" style="width: 100%; height: auto;">
 <p style="text-align: center;"></p>
 </figure>

@@ -179,7 +179,7 @@ Besides, the output of this first step is scaled logits: $Q_i = \sigma(q_i) \in
 After normalization, MANO **aggregates** the logits using the **Lp norm** of the matrix $Q$, defined as:

 <figure id="my-fig" class="numbered" >
-<img src="/content/images/Mano/equation_s.png" class="align-center">
+<img src="/images/Mano/equation_s.png" class="align-center">
 <p style="text-align: center;"></p>
 </figure>

@@ -199,7 +199,7 @@ One of the main advantages of the Lp​ norm over the Nuclear Norm is its **comp
 **Effect of p on Aggregation Sensitivity**

 <figure id="my-fig" class="numbered" style="float: right; margin-right: 10px; width: 45%;">
-<img src="/content/images/Mano/impact_Lp_norm.png" class="align-center" style="width: 100%; height: auto;">
+<img src="/images/Mano/impact_Lp_norm.png" class="align-center" style="width: 100%; height: auto;">
 <p style="text-align: center;"></p>
 </figure>

@@ -217,7 +217,7 @@ Let's see now how to implement MANO in practice!

 Before diving into implementation, it’s important to understand the logic behind the MANO algorithm for unsupervised accuracy estimation.

-![Algorithm 1: MANO Pseudocode](/content//images/Mano/algorithm_mano.png)
+![Algorithm 1: MANO Pseudocode](/images/Mano/algorithm_mano.png)

 The pseudocode above outlines the core procedure: given a model and an unlabeled test set, the method first determines the best way to normalize the model's logits, either using softmax or the novel alternative softrun on an entropy-based criterion (see [Section 1.1](#section-1.1)). Then, it iterates over each sample in the test set, collects the normalized predictions into a matrix, and finally computes an estimation score using the matrix’s normalized L_p norm (see [Section 1.2](#section-1.2)). This score correlates with the model's true accuracy, even without access to ground truth labels.
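The procedure described in the context line above can be sketched in a few lines of NumPy. This is an illustration, not code from the commit or from the post. The pseudocode and the equations for $v(q)$ and the score appear only as images in this diff, so several details here are assumptions: the criterion is taken to be the mean negative log softmax probability, the Taylor branch is a second-order expansion of the exponential, and the defaults `eta=5.0` and `p=4` are placeholder values. The function names are hypothetical.

```python
import numpy as np

def criterion(logits):
    """Assumed entropy-based criterion: mean negative log softmax probability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs.mean()

def softrun(logits, eta=5.0):
    """Normalize each row of logits onto the probability simplex.

    Assumption: a criterion above eta means confident predictions, so the
    exponential (softmax) is used; otherwise a second-order Taylor
    expansion 1 + q + q^2 / 2 replaces it.
    """
    if criterion(logits) > eta:
        v = np.exp(logits - logits.max(axis=1, keepdims=True))
    else:
        v = 1.0 + logits + 0.5 * logits ** 2  # always >= 0.5, so rows stay positive
    return v / v.sum(axis=1, keepdims=True)

def mano_score(logits, p=4, eta=5.0):
    """Normalized entry-wise L_p norm of the matrix of normalized predictions."""
    q = softrun(logits, eta)
    n, k = q.shape
    return (np.sum(np.abs(q) ** p) / (n * k)) ** (1.0 / p)

# Usage on hypothetical logits of shape (N samples, K classes).
uniform_logits = np.zeros((5, 10))        # no class preferred
confident_logits = 50.0 * np.eye(10)      # one dominant class per row
print(mano_score(uniform_logits), mano_score(confident_logits))
```

Under these assumptions the score ranges from $1/K$ for uniform predictions to $K^{-1/p}$ for one-hot predictions, so a higher score indicates more confident predictions across the test set.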

@@ -271,7 +271,7 @@ MANO has been evaluated against **11 baseline methods**, including Rotation Pred
 In this comprehensive evaluation, the authors have considered 3 types of distribution shifts: **synthetic shifts**, where models were tested against artificially corrupted images; **natural shifts**, which involved datasets collected from different distributions than the training data; and **subpopulation shifts**, where certain classes or groups were underrepresented in the training data. To evaluate Mano under synthetic shifts, the authors have used CIFAR-10C, CIFAR-100C, ImageNet-C, and TinyImageNet-C, covering various corruption types and severity levels. For natural shifts, they tested on OOD datasets from PACS, Office-Home, DomainNet, and RR1 WILDS. To assess subpopulation shifts, they used the BREEDS benchmark, including Living-17, Nonliving-26, Entity-13, and Entity-30 from ImageNet-C.

 <figure id="my-fig" class="numbered" style="float: left; margin-left: 10px; width: 45%;">
-<img src="/content/images/Mano/R2_scores.png" class="align-center" style="width: 100%; height: auto;">
+<img src="/images/Mano/R2_scores.png" class="align-center" style="width: 100%; height: auto;">
 <!-- <p style="text-align: center;">$R^2$ distribution ResNet18 on all distribution shifts </p> -->
 </figure>

@@ -280,7 +280,7 @@ On the left, we can see a box plot of $R^2$ distribution showing the estimation
 Additionally, in the figure below, we can see a scatter plot illustrating the outperforming results of Mano on natural shift compared to Dispersion Score and ProjNorm on Entity-18 using ResNet-18.

 <figure id="my-fig" class="numbered">
-<img src="/content/images/Mano/results_plot.png" class="align-center" style="width: 100%; height: auto;">
+<img src="/images/Mano/results_plot.png" class="align-center" style="width: 100%; height: auto;">
 <p style="text-align: center;"></p>
 </figure>

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-*,:after,:before{box-sizing:border-box;padding:0}body{font:1rem/1.5 '-apple-system',BlinkMacSystemFont,avenir next,avenir,helvetica,helvetica neue,ubuntu,roboto,noto,segoe ui,arial,sans-serif;text-rendering:optimizeLegibility;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;padding:2rem;background:#f5f5f5;color:#000}.skip-link{position:absolute;top:-40px;left:0;background:#eee;z-index:100}.skip-link:focus{top:0}h1,h2,h3,h4,h5,strong,b{font-size:inherit;font-weight:600}header{line-height:2;padding-bottom:1.5rem}.link{overflow:hidden;text-overflow:ellipsis;white-space:nowrap;overflow:hidden;text-overflow:ellipsis;text-decoration:none}.time{font-variant-numeric:tabular-nums;white-space:nowrap}blockquote{border-left:5px solid #eee;padding-left:1rem;margin:0}a,a:visited{color:inherit}a:hover,a.heading-link{text-decoration:none}pre{padding:.5rem;overflow:auto;overflow-x:scroll;overflow-wrap:normal}code,pre{font-family:San Francisco Mono,Monaco,consolas,lucida console,dejavu sans mono,bitstream vera sans mono,monospace;font-size:normal;font-size:small;background:#eee}code{margin:.1rem;border:none}ul{list-style-type:square}ul,ol{padding-left:1.2rem}.list{line-height:2;list-style-type:none;padding-left:0}.list li{padding-bottom:.1rem}.meta{color:#777}.content{max-width:70ch;margin:0 auto}header{line-height:2;display:flex;justify-content:space-between;padding-bottom:1rem}header a{text-decoration:none}header ul{list-style-type:none;padding:0}header li,header a{display:inline}h2.post{padding-top:.5rem}header ul a:first-child{padding-left:1rem}.nav{height:1px;background:#000;content:'';max-width:10%}.list li{display:flex;align-items:baseline}.list li time{flex:initial}.hr-list{margin-top:0;margin-bottom:0;margin-right:.5rem;margin-left:.5rem;height:1px;border:0;border-bottom:1px dotted #ccc;flex:1 0 1rem}.m,hr{border:0;margin:3rem 0}img{max-width:100%;height:auto}.post-date{margin:5% 0}.index-date{color:#9a9a9a}.animate-blink{animation:opacity 1s infinite;opacity:1}@keyframes opacity{0%{opacity:1}50%{opacity:.5}100%{opacity:0}}.tags{display:flex;justify-content:space-between}.tags ul{padding:0;margin:0}.tags li{display:inline}.avatar{height:120px;width:120px;position:relative;margin:-10px 0 0 15px;float:right;border-radius:50%}
+*,:after,:before{box-sizing:border-box;padding:0}body{font:1rem/1.5 '-apple-system',BlinkMacSystemFont,avenir next,avenir,helvetica,helvetica neue,ubuntu,roboto,noto,segoe ui,arial,sans-serif;text-rendering:optimizeLegibility;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;padding:2rem;background:#f5f5f5;color:#000}.skip-link{position:absolute;top:-40px;left:0;background:#eee;z-index:100}.skip-link:focus{top:0}h1,h2,h3,h4,h5,strong{font-size:inherit;font-weight:600}header{line-height:2;padding-bottom:1.5rem}.link{overflow:hidden;text-overflow:ellipsis;white-space:nowrap;overflow:hidden;text-overflow:ellipsis;text-decoration:none}.time{font-variant-numeric:tabular-nums;white-space:nowrap}blockquote{border-left:5px solid #eee;padding-left:1rem;margin:0}a,a:visited{color:inherit}a:hover,a.heading-link{text-decoration:none}pre{padding:.5rem;overflow:auto;overflow-x:scroll;overflow-wrap:normal}code,pre{font-family:San Francisco Mono,Monaco,consolas,lucida console,dejavu sans mono,bitstream vera sans mono,monospace;font-size:normal;font-size:small;background:#eee}code{margin:.1rem;border:none}ul{list-style-type:square}ul,ol{padding-left:1.2rem}.list{line-height:2;list-style-type:none;padding-left:0}.list li{padding-bottom:.1rem}.meta{color:#777}.content{max-width:70ch;margin:0 auto;text-align:justify}header{line-height:2;display:flex;justify-content:space-between;padding-bottom:1rem}header a{text-decoration:none}header ul{list-style-type:none;padding:0}header li,header a{display:inline}h2.post{padding-top:.5rem}header ul a:first-child{padding-left:1rem}.nav{height:1px;background:#000;content:'';max-width:10%}.list li{display:flex;align-items:baseline}.list li time{flex:initial}.hr-list{margin-top:0;margin-bottom:0;margin-right:.5rem;margin-left:.5rem;height:1px;border:0;border-bottom:1px dotted #ccc;flex:1 0 1rem}.m,hr{border:1;border-style:dashed;margin:3rem 0}img{max-width:100%;height:auto}.post-date{margin:5% 0}.index-date{color:#9a9a9a}.animate-blink{animation:opacity 1s infinite;opacity:1}@keyframes opacity{0%{opacity:1}50%{opacity:.5}100%{opacity:0}}.tags{display:flex;justify-content:space-between}.tags ul{padding:0;margin:0}.tags li{display:inline}.avatar{height:120px;width:120px;position:relative;margin:-10px 0 0 15px;float:right;border-radius:50%}
