new data doc 📚
gaetanbrison committed Dec 3, 2024
1 parent 26e40b6 commit 353138b
Showing 1 changed file with 92 additions and 33 deletions.
125 changes: 92 additions & 33 deletions docs/carte_ai.data.html
@@ -26,6 +26,8 @@
padding: 2px 4px;
border-radius: 3px;
}


</style>
</head>

@@ -67,32 +69,77 @@ <h1>carte_ai.data package<a class="headerlink" href="#carte-ai-data-package" tit
<h2>carte_ai.data.load_data module<a class="headerlink" href="#module-carte_ai.data.load_data" title="Link to this heading"></a></h2>

<!-- Spotify dataset -->

<dl class="py function">
<dt class="sig sig-object py" id="carte_ai.data.load_data.spotify">
<span class="sig-prename descclassname"><span class="pre">carte_ai.data.load_data.</span></span>
<span class="sig-name descname">spotify</span><span class="sig-paren">()</span>
<a class="headerlink" href="#carte_ai.data.load_data.spotify" title="Link to this definition"></a>
</dt>
<dd>
<p>Load and split the Spotify dataset.</p>
<p>This dataset contains information on over 600,000 Spotify tracks, including audio features and popularity metrics.</p>
<p><strong>Variables:</strong></p>
<ul>
<li><span class="variable-name">track_id</span>: Unique identifier for the track.</li>
<li><span class="variable-name">artists</span>: Name(s) of the artist(s).</li>
<li><span class="variable-name">album_name</span>: Name of the album.</li>
<li><span class="variable-name">track_name</span>: Name of the track.</li>
<li><span class="variable-name">popularity</span>: Popularity score of the track.</li>
<li><span class="variable-name">duration_ms</span>: Duration of the track in milliseconds.</li>
<li><span class="variable-name">danceability</span>: Danceability score (0.0 to 1.0).</li>
<li><span class="variable-name">energy</span>: Energy score (0.0 to 1.0).</li>
</ul>
<p><strong>Examples:</strong></p>
<pre><code>from carte_ai.data.load_data import spotify
df = spotify()
print(df.head())</code></pre>
</dd>
</dl>
<dt class="sig sig-object py" id="carte_ai.data.load_data.spotify">
<span class="sig-prename descclassname"><span class="pre">carte_ai.data.load_data.</span></span>
<span class="sig-name descname">spotify</span><span class="sig-paren">()</span>
<a class="headerlink" href="#carte_ai.data.load_data.spotify" title="Link to this definition"></a>
</dt>
<dd>
<p>Load and split the <strong>Spotify</strong> dataset, which contains detailed information about over 600,000 Spotify tracks, including audio features, popularity metrics, and genres.</p>
<p>This dataset can be used for:</p>
<ul>
<li>Building a recommendation system based on user input or preferences.</li>
<li>Classification tasks using audio features and genres.</li>
<li>Other music analysis and prediction tasks.</li>
</ul>

<p><strong>Variables:</strong></p>
<ul>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">track_id</span>: Unique identifier for the track.</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">artists</span>: Names of the artists who performed the track (separated by ";").</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">album_name</span>: Name of the album.</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">track_name</span>: Name of the track.</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">popularity</span>: Popularity score (0–100).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">duration_ms</span>: Length of the track in milliseconds.</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">explicit</span>: Whether the track contains explicit lyrics (true/false).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">danceability</span>: Danceability score (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">energy</span>: Energy score (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">key</span>: Musical key of the track.</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">loudness</span>: Loudness in decibels (dB).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">mode</span>: Modality of the track (major=1, minor=0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">speechiness</span>: Presence of spoken words (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">acousticness</span>: Confidence measure for acoustic content (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">instrumentalness</span>: Likelihood of being instrumental (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">liveness</span>: Likelihood that the track was recorded in front of a live audience (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">valence</span>: Musical positiveness (0.0–1.0).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">tempo</span>: Tempo in beats per minute (BPM).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">time_signature</span>: Estimated time signature, from 3 to 7 (i.e. 3/4 to 7/4).</li>
<li><span class="variable-name" style="color: #d9534f; background-color: #f7f7f7; padding: 2px 6px; border-radius: 3px;">track_genre</span>: Genre of the track.</li>
</ul>

<p><strong>Example Usage:</strong></p>
<div style="position: relative; padding: 5px; border-radius: 6px; font-family: Consolas, 'Courier New', monospace; overflow-x: auto; border: 1px solid #ddd; width: 100%; box-sizing: border-box; margin: 0;">
<!-- Copy Button -->
<button onclick="copyCode(this)" style="position: absolute; top: 5px; right: 5px; background-color: #007bff; color: white; border: none; padding: 3px 6px; font-size: 10px; border-radius: 4px; cursor: pointer;">
Copy
</button>
<!-- Code Block -->
<code id="codeBlock" style="display: block; margin: 0; padding: 0 0 0 10px; text-align: left;">
<br>
<span style="color: #af01db; font-weight: bold;">from</span> <span>carte_ai.data.load_data</span> <span style="color: #af01db; font-weight: bold;">import</span> <span>*</span>
<br>
<br>
<span>num_train</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span style="color: #126544; font-weight: bold;">128</span> <span style="color: #0d8312;"># Example: set the number of training groups/entities</span>
<br>
<span>random_state</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span style="color: #126544; font-weight: bold;">1</span> <span style="color: #0d8312;"># Set a random seed for reproducibility</span>
<br>
<br>
<span>X_train</span>, <span>X_test</span>, <span>y_train</span>, <span>y_test</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span>spotify</span><span>(num_train, random_state)</span>
<br>
<br>
<span style="color: #0d8312;"># Print dataset shapes</span>
<br>
<span>print</span><span>(</span><span style="color: #a21515; font-weight: bold;">"Spotify dataset:"</span><span>, X_train.shape, X_test.shape)</span>
<br>
<br>
</code>

</div>
</dd>
</dl>
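<p>As a rough illustration of the <code>artists</code> format described in the variable list (names separated by ";"), the sketch below splits such a field into a list of names. The helper <code>split_artists</code> and the example row are hypothetical, invented for illustration; they are not part of <code>carte_ai</code> and are not taken from the dataset.</p>

```python
def split_artists(artists_field):
    """Split a ';'-separated artists string into a list of names.

    Hypothetical helper for illustration; not part of carte_ai.
    Surrounding whitespace is trimmed and empty entries are dropped.
    """
    return [name.strip() for name in artists_field.split(";") if name.strip()]


# Hypothetical row, invented for illustration (not real dataset content).
example_row = {"track_name": "Example Track", "artists": "Artist A;Artist B"}

artist_list = split_artists(example_row["artists"])
print(artist_list)
```

<p>Dropping empty entries is a design choice of this sketch so that a trailing ";" or an empty field yields no blank names.</p>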


<!-- Wina_PL dataset -->
<dl class="py function">
@@ -131,15 +178,27 @@ <h2>carte_ai.data.load_data module<a class="headerlink" href="#module-carte_ai.d
Copy
</button>
<!-- Code Block -->
<code id="codeBlock" style="white-space: pre-wrap; display: block; margin: 0; padding: 0; text-align: left;">
<span style="color: #0000FF; font-weight: bold;">from</span> <span style="color: #228B22;">carte_ai.data.load_data</span> <span style="color: #0000FF; font-weight: bold;">import</span> <span>wina_pl</span>

<span style="color: #6c757d;"># Load the dataset</span>
<span>df</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span>wina_pl</span><span>()</span>

<span style="color: #6c757d;"># Display the first few rows</span>
<span>print</span><span>(df.head())</span>
<code id="codeBlock" style="display: block; margin: 0; padding: 0 0 0 10px; text-align: left;">
<br>
<span style="color: #af01db; font-weight: bold;">from</span> <span>carte_ai.data.load_data</span> <span style="color: #af01db; font-weight: bold;">import</span> <span>*</span>
<br>
<br>
<span>num_train</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span style="color: #126544; font-weight: bold;">128</span> <span style="color: #0d8312;"># Example: set the number of training groups/entities</span>
<br>
<span>random_state</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span style="color: #126544; font-weight: bold;">1</span> <span style="color: #0d8312;"># Set a random seed for reproducibility</span>
<br>
<br>
<span>X_train</span>, <span>X_test</span>, <span>y_train</span>, <span>y_test</span> <span style="color: #DD0000; font-weight: bold;">=</span> <span>wina_pl</span><span>(num_train, random_state)</span>
<br>
<br>
<span style="color: #0d8312;"># Print dataset shapes</span>
<br>
<span>print</span><span>(</span><span style="color: #a21515; font-weight: bold;">"Wina Poland dataset:"</span><span>, X_train.shape, X_test.shape)</span>
<br>
<br>
</code>


</div>

<script>