Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query for cancelled trips to GTFS GraphQL API #5393

Merged
merged 57 commits into from
Dec 16, 2024

Conversation

vesameskanen
Copy link
Contributor

Summary

This PR adds a query, which returns information about cancelled trips, into Gtfs graphQL API. The query supports paging and filtering by feed id. It returns a list of (trip, date) pairs.

Unit tests

None. The new code has been tested manually using graphiQL endpoint. Changing trip selector in DefaultTransitService class temporarily to return UPDATED (not CANCELED) trips helps to collect test trips with any realtime modified data.

Documentation

Docs in the graphQL schema

@vesameskanen vesameskanen requested a review from a team as a code owner October 4, 2023 14:11
@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

Attention: Patch coverage is 70.54264% with 76 lines in your changes missing coverage. Please review.

Project coverage is 69.81%. Comparing base (e919778) to head (5217ce5).
Report is 194 commits behind head on dev-2.x.

Files with missing lines Patch % Lines
...ntripplanner/apis/gtfs/generated/GraphQLTypes.java 0.00% 19 Missing ⚠️
...planner/transit/service/DefaultTransitService.java 31.81% 12 Missing and 3 partials ⚠️
...ipplanner/apis/gtfs/datafetchers/StopCallImpl.java 66.66% 4 Missing and 6 partials ⚠️
.../apis/gtfs/datafetchers/TripOnServiceDateImpl.java 82.97% 4 Missing and 4 partials ⚠️
...ntripplanner/apis/gtfs/mapping/PickDropMapper.java 28.57% 4 Missing and 1 partial ⚠️
...pentripplanner/apis/gtfs/datafetchers/LegImpl.java 20.00% 4 Missing ⚠️
...g/opentripplanner/ext/flex/FlexibleTransitLeg.java 0.00% 2 Missing ⚠️
...va/org/opentripplanner/apis/gtfs/GraphQLUtils.java 33.33% 1 Missing and 1 partial ⚠️
...fs/datafetchers/CallScheduledTimeTypeResolver.java 66.66% 1 Missing and 1 partial ⚠️
...tfs/datafetchers/CallStopLocationTypeResolver.java 66.66% 1 Missing and 1 partial ⚠️
... and 4 more
Additional details and impacted files
@@              Coverage Diff              @@
##             dev-2.x    #5393      +/-   ##
=============================================
+ Coverage      69.78%   69.81%   +0.02%     
- Complexity     17781    17856      +75     
=============================================
  Files           2017     2030      +13     
  Lines          76040    76234     +194     
  Branches        7781     7799      +18     
=============================================
+ Hits           53067    53222     +155     
- Misses         20269    20288      +19     
- Partials        2704     2724      +20     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@leonardehrenfried leonardehrenfried changed the title New query for cancelled trips into Gtfs graphQL API Add query for cancelled trips to GTFS GraphQL API Oct 5, 2023
@vesameskanen vesameskanen added GTFS Related to import of GTFS data Improvement labels Oct 6, 2023
@t2gran t2gran added this to the 2.5 (next release) milestone Oct 10, 2023
@leonardehrenfried leonardehrenfried marked this pull request as draft October 12, 2023 14:00
@vesameskanen
Copy link
Contributor Author

My current understanding of what should be done:

  • Change the serviceDate of DatedTrip record to the new date+time scalar Joel is working on (i.e return a single datetime scalar instead of separate date and departure time )
  • Add Trip fields which are represented in bad way which we wish to deprecate to DatedTrip
  • Add fields which we expect to depend on the resolved realtime context to DatedTrip (headsign? realtime status?)

While doing this, we should not add features which are not needed yet, or unnecessarily duplicated fields of the Trip. In my opinion, duplicating a field from the Trip object to DatedTrip level indicates that the values presumably differ.

@leonardehrenfried
Copy link
Member

Yes, I would keep the implementation of DatedTrip to an absolute minimum and only include what you need right now.

@optionsome however thinks that you should include stop times. I can live without them and can add them when I need it.

@t2gran t2gran modified the milestones: 2.5, 2.6 (next release) Mar 13, 2024
@optionsome optionsome marked this pull request as ready for review August 8, 2024 13:00
@leonardehrenfried
Copy link
Member

@vesameskanen can you be the second reviewer?

@t2gran
Copy link
Member

t2gran commented Nov 26, 2024

There has been a long discussion about the API schema above. I just want to summarise my input for future reference. I quickly put together an "API domain model" for this. The diagram below is my suggestion to the design, it follows the guideline in this PR.

In the future I want us to draw diagrams like the on below - they are much easier to read and discuss. If we draw at a API level or OTP domain level probably does not matter - but try to focus on the entities, value-objects and the relationships. When we agree on the "language", then it become a technical task to map it to a GraphQL schema or Java model.

Diagram
EstimatedCall

  1. Stuff in red exist in OTP, but should not be included in the API now - only when it is requested.
  2. TripOnDate - Service is not part of the name, because there is no ambiguity - having TripOnRunningDate does not make any sence.
  3. There is no support for flex in the first version, but letting stop be of type StopLocation and scheduledTime optional allow us to reuse this if we want. The is a big IF, but the point is to be able to - not forced. There are open questions:
    1. Can the scheduled stop be e.g. a GroupStop and the estimated/actual stop be a RegularStop - I think this make perfect sence - but the business rules is not known for RT on FLEX.
    2. Flex estinated/actual times should not be a window in my view. If a plan for the flex trip exist the route should be planned and there should be estimates for when to arrive where. There might be more stop-locations than in the scheduled plan - but again this must be analysed. My guess is that we will not be able to guess the final solution here today, so it is pointless to put to much of it in the design.
  4. scheduledTime should be optimal, ADDED Trips does not have a schedule - we face it today, but I am not sure if this is ok. Also, for Flex the scheduled time might be a time-instant or a time-window.
  5. I would like to find a better name for ArrivalDepartureTime, but has not done it. I like to find good names for value-objects types - these are the easiest types to reuse and they do not need to change. If a change is need, the entity using it can be changed instead. Value-objects are also easy to use for the client, reusing logic to handle it.
  6. Departure and arrival-times are not required. This is also a simplification we have done in the internal model witch has leaked out. I am not worried here, and it might be just much easier for the client if we say they are required. If needed we can add information about the origin [OTP generated/feed] of these fields instead.
  7. I have focused on the "domain" not the "graph QL" schema - I have not included all fields, and some names are probably wrong.
  8. Note! there is no interface for the ArrivalDepartureTime, ArrivalDepartureTimeWindow and EstimatedArrivalDepartureTimeArrivalDepartureEstimatedTime - these types does not have anything in common. You could argue that there should be a union for the scheduled time/window - I would suggest it should be CallScheduledTime - it belong to Call and should not be reused in other places, so call need to be part of the name. I suggest using the owner type as a prefix.
  9. Using union for call->stop is also ok for example CallStopLocation.

@leonardehrenfried
Copy link
Member

@optionsome You have conflicts again.

@optionsome
Copy link
Member

2. `TripOnDate` - `Service` is not part of the name, because there is no ambiguity - having `TripOnRunningDate` does not make any sence.

I think people can confuse the concept of a normal date vs service date.

3. There is no support for flex in the first version, but letting stop be of type StopLocation and scheduledTime optional allow us to reuse this if we want. The is a big IF, but the point is to be able to - not forced. There are open questions:
   
   1. Can the scheduled stop be e.g. a GroupStop and the estimated/actual stop be a RegularStop - I think this make perfect sence - but the business rules is not known for RT on FLEX.
   2. Flex estinated/actual times should not be a window in my view. If a plan for the flex trip exist the route should be planned and there should be estimates for when to arrive where. There might be more stop-locations than in the scheduled plan - but again this must be analysed. My guess is that we will not be able to guess the final solution here today, so it is pointless to put to much of it in the design.

The thing is, if we don't make the TripOnDate a union in the schema, "There might be more stop-locations than in the scheduled plan" issue (or similar issues) might be difficult to solve. Making the calls a union/interface could lessen these types of issues but I don't know if even that is a solution for the aforementioned issue.

5. I would like to find a better name for `ArrivalDepartureTime`, but has not done it. I like to find good names for value-objects types - these are the easiest types to reuse and they do not need to change. If a change is need, the entity using it can be changed instead. Value-objects are also easy to use for the client, reusing logic to handle it.

I think we decided to call it CallTime.

6. Departure and arrival-times are not required. This is also a simplification we have done in the internal model witch has leaked out. I am not worried here, and it might be just much easier for the client if we say they are required. If needed we can add information about the origin [OTP generated/feed] of these fields instead.

I didn't make them required because I'm not sure if we always will have arrival time for the first stop or departure time for the last stop in the future.

7. I have focused on the "domain" not the "graph QL" schema - I have not included all fields, and some names are probably wrong.

In general, I think designing the domain is a good idea, however I think designing the schema is even more important since changing that is much more difficult.

8. Note! there is no interface for the `ArrivalDepartureTime`,  `ArrivalDepartureTimeWindow` and `EstimatedArrivalDepartureTime` - these types does not have anything in common. You could argue that there should be a union for the scheduled time/window - I would suggest it should be `ScheduledCallTime` - it belong to `Call` and should not be reused in other places, so call need to be part of the name.

I wasn't quite sure what is the best model for this. Should the scheduled/estimates be grouped together inside departure or arrival fields or should there be schedules and estimates which have departure and arrival fields. For the non-flex trips, grouping the estimates and schedules together feels more natural. However, with flex time windows it's not possible because there is no scheduled arrival or departure. That was the reason why I ended up using a union for the Calls so these could be modeled in different fashion.

@optionsome
Copy link
Member

We did some schema design with @t2gran, I'll update this pr to match the design later:

{
  calls {
    stopLocation = (Stop | StopArea | GroupOfStops) : CallStopLocation Union
    scheduled {
      plannedStatus = SKIPPED/SCHEDULED
      time { (TimeWindow | ArrivalDepartureTime) : CallScheduledTime
        timeWindow {
          start OffsetDateTime
          end OffsetDateTime
        }
        or
        {
          arrival OffsetDateTime
          departure OffsetDateTime
        }
      }
    }
    realTime CallRealTime  {
      cancelled = SKIPPED/SCHEDULED
      arrival EstimatedTime {
        time
        delay
        uncertainty percentage
        recorded RECORDED/ESTIMATE/UNKNOWN
      }
      departure {
        time
        delay
        isActualTime = false
      }
    }
  }
}

@optionsome
Copy link
Member

I've implemented new version of schema now. I'm not quite sure where some of the model classes needed by the API should be. They are now in the GTFS API package and elsewhere which is probably not ideal at least.

@optionsome
Copy link
Member

The schema design we did with @t2gran includes some things that I have not implemented in the scope of this pr as they are not needed for this.

}

"What is scheduled for a trip on a service date for a stop location."
type CallSchedule {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for wrapping this in another type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can later include information if there is a planned cancellation, for example.

Copy link
Member

@leonardehrenfried leonardehrenfried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked this for a while and cannot find any problems with it other than the question about the wrapped type.

I must admit, I have been reviewing this so many times, that I'm having trouble concentrating.

"Scheduled arrival and departure times for this stop location."
schedule: CallSchedule
"The stop where this arrival/departure happens."
stopLocation: CallStopLocation!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the naming convention you decided on, naming the union by the "aggregate" Call or the wrapping type StopCall. In this case CallStopLocation or StopCallStopLocation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we have made a clear decision on this. It depends a bit on if we think there will be some other need for stop locations within the call scope or not. I think mostly we would have the same set of different types of stop locations for different use cases.

@t2gran
Copy link
Member

t2gran commented Dec 12, 2024

In general, I think designing the domain is a good idea, however I think designing the schema is even more important since changing that is much more difficult.

Just, so we are on the same page - the domain model is the conceptual model of the "real world" - if you do a mistake here, the schema will be wrong, and the code will be wrong - it it those mistakes that cost 1000 times more than other mistakes.

@optionsome
Copy link
Member

Just, so we are on the same page - the domain model is the conceptual model of the "real world" - if you do a mistake here, the schema will be wrong, and the code will be wrong - it it those mistakes that cost 1000 times more than other mistakes.

That is true but we can also make wrong schema decisions based on a valid domain model so we should have some focus on the schema design as well.

@optionsome optionsome requested a review from tkalvas December 12, 2024 13:56
@t2gran
Copy link
Member

t2gran commented Dec 13, 2024

✅ For the record I approve the schema.graphql - I have not looked into the code.

@tkalvas tkalvas merged commit 31c40f7 into opentripplanner:dev-2.x Dec 16, 2024
5 checks passed
t2gran pushed a commit that referenced this pull request Dec 16, 2024
@tkalvas tkalvas deleted the gtfsgraphql-cancelled-trips branch December 16, 2024 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GTFS Related to import of GTFS data Improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants