""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.f1_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.macro_f1`
310
+
""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.f1_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.f1_score`
311
311
312
312
:param y_true: True measurement or could be a pandas.DataFrame where column label 'y' corresponds to the true measurement.
""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.recall_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.macro_recall`
335
+
""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.recall_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.recall_score`
336
336
337
337
:param y_true: True measurement or could be a pandas.DataFrame where column label 'y' corresponds to the true measurement.
""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.precision_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.macro_precision`
360
+
""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.precision_score` (as :py:attr:`score_func`) with the parameteres needed to compute the macro score. The parameters not described can be found in :py:func:`~sklearn.metrics.precision_score`
361
361
362
362
:param y_true: True measurement or could be a pandas.DataFrame where column label 'y' corresponds to the true measurement.
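Taken together, the three hunks above correct the cross-references so they point at the actual `sklearn.metrics` functions used as :py:attr:`score_func`. For orientation, below is a minimal sketch of how these wrappers are called; the toy arrays are illustrative only, and it is an assumption (suggested by the docstrings, not shown in this diff) that `recall_score` and `precision_score` are exposed in `CompStats.metrics` alongside `f1_score` and forward `average='macro'` to the underlying sklearn functions.

```python
import numpy as np

# f1_score is imported from CompStats.metrics later in this page;
# recall_score and precision_score are assumed to live there as well.
from CompStats.metrics import f1_score, recall_score, precision_score

y_true = np.array([0, 1, 2, 1, 0, 2])  # true measurement; per the docstring, a pandas.DataFrame with column 'y' also works
y_pred = np.array([0, 1, 1, 1, 0, 2])  # a system's predictions

# Each call mirrors the sklearn.metrics API but returns a Perf instance
# (which bootstraps the score) rather than a single float.
f1 = f1_score(y_true, y_pred, average='macro')
recall = recall_score(y_true, y_pred, average='macro')        # assumed wrapper
precision = precision_score(y_true, y_pred, average='macro')  # assumed wrapper
```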
quarto/CompStats.qmd: 23 additions & 9 deletions
@@ -18,7 +18,7 @@ execute:
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by the organizers. An essential challenge for organizers arises when comparing algorithms' performance, assessing multiple participants, and ranking them. Statistical tools are often used for this purpose; however, traditional statistical methods often fail to capture decisive differences between systems' performance. CompStats implements an evaluation methodology for statistically analyzing competition results. CompStats offers several advantages, including off-the-shelf comparisons with correction mechanisms and the inclusion of confidence intervals.
:::
- ::: {.card title='Installing using conda'}
+ ::: {.card title='Installing using conda' .flow}
`CompStats` can be installed using the conda package manager with the following instruction.
A more general approach to installing `CompStats` is through pip, as illustrated in the following instruction.
```{sh}
@@ -41,8 +41,12 @@ pip install CompStats
To illustrate the use of `CompStats`, the following snippets show an example. The instructions load the necessary libraries, including the one used to obtain the problem (e.g., digits) and four different classifiers; the last line imports the score used to measure performance and compare the algorithms.
+ Below the imports is the code that loads the digits problem and splits the dataset into training and validation sets.
+ ::: {.card title="Dataset and libraries" .flow}
```{python}
#| echo: true
+ #| code-fold: true

from sklearn.svm import LinearSVC
from sklearn.naive_bayes import GaussianNB
@@ -52,42 +56,51 @@ from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.base import clone
from CompStats.metrics import f1_score
+ X, y = load_digits(return_X_y=True)
+ _ = train_test_split(X, y, test_size=0.3)
+ X_train, X_val, y_train, y_val = _
```
+ :::

- The first step is to load the digits problem and split the dataset into training and validation sets. The second step is to estimate the parameters of a linear Support Vector Machine and predict the validation set's classes. The predictions are stored in the variable `hy`.
+ The first line estimates the parameters of a linear Support Vector Machine and predicts the validation set's classes. The predictions are stored in the variable `hy`.
+ ::: {.card title="Linear SVM" .flow}
```{python}
#| echo: true
- X, y = load_digits(return_X_y=True)
- _ = train_test_split(X, y, test_size=0.3)
- X_train, X_val, y_train, y_val = _
m = LinearSVC().fit(X_train, y_train)
hy = m.predict(X_val)
```
+ :::
Once the predictions are available, it is time to measure the algorithm's performance, as seen in the following code. Note that the `sklearn.metrics` API is followed; the difference is that the function returns an instance whose methods can be used to estimate different performance statistics and compare algorithms.
- ## Column
+ ::: {.card title="Score" .flow}
```{python}
#| echo: true

score = f1_score(y_val, hy, average='macro')
score
```
+ :::
+ ## Column
Continuing with the example, let us assume that one wants to test another classifier on the same problem, in this case, a random forest, as can be seen in the following two lines. The second line predicts the validation set and adds it to the analysis.
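The two lines referred to are cut off in this excerpt. A hedged sketch of what adding a second system typically looks like with this API is shown below: the random forest is fit on the same training split, and its validation predictions are handed to the `score` object created earlier. Calling the `Perf` instance directly with a `name` keyword is an assumption here, not something visible in this diff; the variables `X_train`, `X_val`, and `score` come from the snippets above.

```python
from sklearn.ensemble import RandomForestClassifier

# Fit a second classifier on the same training split used for the linear SVM.
forest = RandomForestClassifier().fit(X_train, y_train)

# Hand its validation predictions to the analysis held by `score`.
# The call signature (predictions plus a `name` keyword) is assumed.
score(forest.predict(X_val), name='Random Forest')
```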