Info extractor rework #890

dyc3 · 2023-05-15T22:22:44Z

The info extractor has some problems that make implementing new features hard.

- URL parsing is too tightly coupled to metadata extraction.
- There isn't a clean way to obtain the URL that was resolved into a Video or an array of Videos.
- Collections are not as clearly defined as Video is, resulting in lots of duplicate code.
- ServiceAdapters don't have a clean, easy-to-follow mechanism to provide Videos where service != adapter.serviceId.
- There isn't a clean, fully type-checked way to provide startAt and endAt metadata that is embedded in the URL (e.g. YouTube's t=5s query parameter for starting the video at 5 seconds).
- "Highlighting" a video in an add preview is incredibly janky (and I think it's broken right now).
  - Highlighting is a feature that shows the referenced video at the very top of the list if you put in a YouTube link that references both a playlist and a video (e.g. https://www.youtube.com/watch?v=OtqD0ddbLic&list=PLqKpXXbHNvC83j0C8UsR817qcrXupLGZ5).
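
To make the startAt point concrete, here is a minimal sketch of pulling a typed start offset out of a URL. `parseStartAt` is a hypothetical helper, not something in the codebase, and it assumes `t` values shaped like `90`, `5s`, `1m30s`, or `1h2m3s`; anything else is treated as absent.

```typescript
// Hypothetical helper: reads a start offset (in seconds) from a YouTube-style `t` parameter.
// Assumes values shaped like "90", "5s", "1m30s", or "1h2m3s"; anything else yields undefined.
function parseStartAt(url: string): number | undefined {
  const raw = new URL(url).searchParams.get("t");
  if (raw === null || raw === "") return undefined;

  // Plain number of seconds, e.g. t=90
  if (/^\d+$/.test(raw)) return parseInt(raw, 10);

  // Unit-suffixed form, e.g. t=1h2m3s; every part is optional
  const match = /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$/.exec(raw);
  if (match === null) return undefined;

  const [, h, m, s] = match;
  return Number(h ?? 0) * 3600 + Number(m ?? 0) * 60 + Number(s ?? 0);
}
```

Returning `number | undefined` rather than a raw string means callers can't forget to handle the "no start time" case, which is the kind of type checking the list above asks for.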

I want the new info extractor to be able to:

- Resolve all of the above issues.
- Do pagination (Add Preview Pagination #172).
- Provide extra metadata for queue items (e.g. startAt and endAt).
- Maintain all the caching and bandwidth-saving logic we already have.
  - Maybe make it possible to cache add previews locally.
- Be able to cache collections (playlists, channels, etc.).

So far, what I'm thinking, when diagrammed out, is this:

classDiagram

    class VideoId {
        service: string
        id: string
    }

    class Video {
        id: VideoId
        meta: VideoMetadata
    }

    Video <-- VideoId

    class CollectionId {
        service: string
        id: string
        page: string
    }

    class ExtractStrategy {
        <<abstract>>
        resolve(url) Video[]
    }

    ExtractStrategy <|-- Single
    ExtractStrategy <|-- Multi
    Multi <|-- Collection

    class ServiceAdapter {
        <<abstract>>
        canHandleUrl(url) boolean
        isCollectionUrl(url) boolean
        parseSingle(url) VideoId
        fetchSingle(videoId) Video
        fetchMulti(videoId[]) Video[]
        parseCollection(url) CollectionId
        fetchCollection(collectionId) VideoId[]
    }

    ServiceAdapter <|-- Youtube
    ServiceAdapter <|-- Vimeo
    ServiceAdapter <|-- Direct

This isn't perfect or complete, and I'll be refining it and updating this issue.
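
As a rough TypeScript reading of the diagram, the sketch below wires the Single and Collection strategies to an adapter. Several things here are assumptions for illustration: the fields of `VideoMetadata`, the `pickStrategy` helper, and the in-memory `fakeAdapter`. Signatures are kept synchronous to mirror the diagram; real adapters would presumably return promises. `Multi` is left out because its behavior isn't specified yet, so `Collection` extends `ExtractStrategy` directly.

```typescript
// Data types from the diagram. VideoMetadata's fields are assumed for illustration.
interface VideoId { service: string; id: string }
interface VideoMetadata { title?: string; startAt?: number; endAt?: number }
interface Video { id: VideoId; meta: VideoMetadata }
interface CollectionId { service: string; id: string; page: string }

// Only the adapter methods that the strategies below need.
interface ServiceAdapter {
  isCollectionUrl(url: string): boolean;
  parseSingle(url: string): VideoId;
  fetchSingle(id: VideoId): Video;
  fetchMulti(ids: VideoId[]): Video[];
  parseCollection(url: string): CollectionId;
  fetchCollection(id: CollectionId): VideoId[];
}

abstract class ExtractStrategy {
  constructor(protected adapter: ServiceAdapter) {}
  abstract resolve(url: string): Video[];
}

// URL parsing (parseSingle) and metadata fetching (fetchSingle) are separate steps.
class Single extends ExtractStrategy {
  resolve(url: string): Video[] {
    return [this.adapter.fetchSingle(this.adapter.parseSingle(url))];
  }
}

// A collection resolves to IDs first, then the IDs are fetched in bulk.
class Collection extends ExtractStrategy {
  resolve(url: string): Video[] {
    const ids = this.adapter.fetchCollection(this.adapter.parseCollection(url));
    return this.adapter.fetchMulti(ids);
  }
}

// Hypothetical helper that chooses a strategy for a URL.
function pickStrategy(adapter: ServiceAdapter, url: string): ExtractStrategy {
  return adapter.isCollectionUrl(url) ? new Collection(adapter) : new Single(adapter);
}

// Toy in-memory adapter, purely for demonstration; no real service is contacted.
const fakeAdapter: ServiceAdapter = {
  isCollectionUrl: (url) => new URL(url).searchParams.has("list"),
  parseSingle: (url) => ({ service: "fake", id: new URL(url).searchParams.get("v") ?? "" }),
  fetchSingle: (id) => ({ id, meta: { title: `Video ${id.id}` } }),
  fetchMulti: (ids) => ids.map((id) => ({ id, meta: { title: `Video ${id.id}` } })),
  parseCollection: (url) => ({
    service: "fake",
    id: new URL(url).searchParams.get("list") ?? "",
    page: "1",
  }),
  fetchCollection: () => [
    { service: "fake", id: "a" },
    { service: "fake", id: "b" },
  ],
};
```

Because the parse step returns a plain `VideoId` or `CollectionId` before anything is fetched, those IDs could also serve as cache keys, though that part is not sketched here.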

Some notes:

- In CollectionId, page is a string so that it can hold either a page number or a page token (the YouTube API uses those).
- ServiceAdapter should provide a default, overridable implementation of fetchMulti(), allowing derived adapters to replace it with something more optimized if applicable.
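
The fetchMulti() note could look roughly like the sketch below. `SimpleAdapter` and `BatchAdapter` are hypothetical, and the `calls` counters exist only to show how many requests each approach would make. As in the diagram, the signatures are synchronous; a real implementation would presumably be async.

```typescript
interface VideoId { service: string; id: string }
interface Video { id: VideoId; meta: { title?: string } }

abstract class ServiceAdapter {
  abstract fetchSingle(id: VideoId): Video;

  // Default implementation: one lookup per ID. Works for any adapter, but may be slow.
  fetchMulti(ids: VideoId[]): Video[] {
    return ids.map((id) => this.fetchSingle(id));
  }
}

// Hypothetical adapter that relies on the default fetchMulti().
class SimpleAdapter extends ServiceAdapter {
  calls = 0;

  fetchSingle(id: VideoId): Video {
    this.calls++; // stands in for one request per video
    return { id, meta: { title: `Video ${id.id}` } };
  }
}

// Hypothetical adapter whose backing service supports batch lookups.
class BatchAdapter extends ServiceAdapter {
  calls = 0;

  fetchSingle(id: VideoId): Video {
    this.calls++;
    return { id, meta: { title: `Video ${id.id}` } };
  }

  // Replaces the default with a single batched request.
  fetchMulti(ids: VideoId[]): Video[] {
    this.calls++; // stands in for one request covering every ID
    return ids.map((id) => ({ id, meta: { title: `Video ${id.id}` } }));
  }
}
```

With three IDs, `SimpleAdapter` makes three lookups through the default, while `BatchAdapter` makes one.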

dyc3 added the help wanted, server, and refactor labels on May 15, 2023