GSoC: Distributed error reporting #12489

akolson · 2024-07-25T19:04:25Z

Summary

This PR implements the distributed error reporting feature as part of the Google Summer of Code (GSoC) 2024 program. See the epic for details.

References

Closes #12214

Reviewer guidance

All tests associated with the implementation must run successfully

Testing checklist

Contributor has fully tested the PR manually
If there are any front-end changes, before/after screenshots are included
Critical user journeys are covered by Gherkin stories
Critical and brittle code paths are covered by unit tests

PR process

PR has the correct target branch and milestone
PR has 'needs review' or 'work-in-progress' label
If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
If this is an important user-facing change, PR or related issue has a 'changelog' label
If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

Automated test coverage is satisfactory
PR is fully functional
PR has been tested for accessibility regressions
External dependency files were updated if necessary (yarn and pip)
Documentation is updated
Contributor is in AUTHORS.md

bjester

Main blockers: we should add more logic to remove parameters like passwords from request data, and we should configure request timeouts on the report requests.

bjester · 2024-10-08T17:51:06Z

kolibri/core/assets/src/api-resources/__tests__/errorReport.test.js

+      },
+    };
+
+    Resource.client = jest.fn();


Ideally, whenever you mock something in Python or JS, you want to ensure that the original implementation can be restored after the test completes. That keeps one test's mocks from interfering with other tests.

Since this directly replaces Resource.client, there is no way to restore it. It would be better to use jest.spyOn or jest.replaceProperty here, and to do so in beforeEach. Then, instead of clearAllMocks in afterEach (which only clears the mock state), I would suggest restoreAllMocks, which restores every mock to its original implementation (assuming the mock was created with an approach that supports restoring).
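
Since the comment applies to Python as well, here is a minimal sketch of the same restore-after-test pattern using unittest.mock. It is an illustration only, not code from this PR: the Resource class is a hypothetical stand-in, and patch.object with addCleanup plays the role that a spy created in beforeEach and restoreAllMocks in afterEach play in Jest.

```python
import unittest
from unittest import mock


class Resource:
    """Hypothetical stand-in for the object whose client gets mocked."""

    @staticmethod
    def client(**kwargs):
        return "real response"


class ReportTest(unittest.TestCase):
    def setUp(self):
        # patch.object remembers the original attribute when it starts...
        patcher = mock.patch.object(Resource, "client")
        self.mock_client = patcher.start()
        # ...and addCleanup puts it back even if the test fails.
        self.addCleanup(patcher.stop)

    def test_client_is_mocked(self):
        Resource.client(url="/report")
        self.mock_client.assert_called_once_with(url="/report")
```

Assigning Resource.client = mock.Mock() directly would pass the same test, but nothing would put the original back afterwards.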

bjester · 2024-10-08T17:55:15Z

kolibri/core/assets/src/utils/errorReportUtils.js

+          height: window.screen.height,
+          available_width: window.screen.availWidth,
+          available_height: window.screen.availHeight,
+        },


There was discussion about using the screen size breakpoints instead of the actual width and height. Is that the case? It doesn't look like it. The reason is privacy: specific sizes can be used to identify users, which reduces the anonymity of the data.

Yes, we should definitely make this update. Although I am also noticing that this file shouldn't exist, because it has been moved into the plugin to make this behaviour pluggable.
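
The bucketing idea could be sketched as follows. This is only an illustration of the logic, written in Python. The breakpoint boundaries, labels, and the screen_breakpoint helper are hypothetical and would need to match the breakpoints the frontend already defines.

```python
import bisect

# Hypothetical breakpoint boundaries in CSS pixels; the real values would
# come from the project's own responsive-layout definitions.
BREAKPOINT_BOUNDS = [480, 840, 1280]
BREAKPOINT_LABELS = ["small", "medium", "large", "xlarge"]


def screen_breakpoint(width):
    """Map an exact pixel width to a coarse, less identifying bucket."""
    index = bisect.bisect_right(BREAKPOINT_BOUNDS, width)
    return BREAKPOINT_LABELS[index]
```

Reporting only the bucket label means that many devices share the same value, so the report reveals far less about any individual screen.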

bjester · 2024-10-08T17:56:28Z

kolibri/core/errorreports/middleware.py

+    request_headers.pop("Cookie", None)
+
+    request_get = dict(request.GET)
+    request_get.pop("token", None)


In addition, probably for POST, we should ensure passwords are not sent?
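
One possible approach, sketched under assumptions: filter request parameters through a deny-list of key fragments before they are stored in the report. The SENSITIVE_MARKERS tuple and the scrub_params helper are hypothetical names, not part of this PR, and the list of markers is illustrative rather than complete.

```python
# Hypothetical deny-list of substrings that mark a parameter as sensitive.
SENSITIVE_MARKERS = ("password", "token", "secret")


def scrub_params(params):
    """Return a copy of a mapping without keys that look sensitive."""
    return {
        key: value
        for key, value in dict(params).items()
        if not any(marker in key.lower() for marker in SENSITIVE_MARKERS)
    }
```

The same helper could be applied to both the GET and POST data, so that the rules for what counts as sensitive live in one place.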

bjester · 2024-10-08T17:58:02Z

kolibri/core/errorreports/models.py

+                error_report.context = context
+
+        error_report.save()
+        logger.error("ErrorReports: Database updated.")


This feels more like info-type logging?

bjester · 2024-10-08T18:01:37Z

kolibri/core/analytics/tasks.py

-        ping_once(started, server=server)
+        pingback_id = ping_once(started, server=server)
+        if pingback_id:
+            ping_error_reports.enqueue(args=(server, pingback_id))


I see this creates two different pathways that hinge on the pingback_id. In utils.py, there is already logic that depends on if "id" in data:, which is the same condition as here. It seems like this fits alongside the existing logic there.

bjester · 2024-10-08T18:07:07Z

kolibri/core/assets/src/core-app/index.js

+Vue.config.errorHandler = function (err, vm) {
+  logging.error(`Unexpected Error: ${err}`);
+  const error = new VueErrorReport(err, vm);
+  ErrorReportResource.report(error);
+};
+
+window.addEventListener('error', e => {
+  logging.error(`Unexpected Error: ${e.error}`);
+  const error = new JavascriptErrorReport(e);
+  ErrorReportResource.report(error);
+});
+
+window.addEventListener('unhandledrejection', event => {
+  event.preventDefault();
+  logging.error(`Unhandled Rejection: ${event.reason}`);


I know the unhandledrejection listener will prevent the default logging of the error. With regard to that and the other logging statements, I'm concerned that these may suppress log information developers need, such as a stack trace. If logging.error outputs a stack trace, it may not be the same trace as the error itself.

bjester · 2024-10-08T18:09:52Z

kolibri/core/errorreports/tasks.py

+            join_url(server, "/api/v1/errors/report/"),
+            data=errors_json,
+            headers={"Content-Type": "application/json"},
+        )


Since this uses raw Python requests, let's ensure it has explicit timeouts configured, ideally with separate timeouts for connection vs. request.
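
A minimal sketch of what that could look like. The requests library accepts a (connect, read) tuple for its timeout argument, which gives the two separate limits. The timeout values, the post_error_reports helper, and its injectable post parameter are assumptions for illustration, and urljoin stands in here for the join_url helper used in the diff.

```python
from urllib.parse import urljoin

# Hypothetical values; tune for the deployment.
CONNECT_TIMEOUT = 5  # seconds allowed to establish the connection
READ_TIMEOUT = 30  # seconds allowed to wait for response data


def post_error_reports(server, errors_json, post=None):
    """POST serialized error reports with explicit connect/read timeouts."""
    if post is None:
        # Imported lazily so the helper can be exercised with a fake `post`.
        import requests

        post = requests.post
    return post(
        urljoin(server, "/api/v1/errors/report/"),
        data=errors_json,
        headers={"Content-Type": "application/json"},
        # A (connect, read) tuple sets the two timeouts separately.
        timeout=(CONNECT_TIMEOUT, READ_TIMEOUT),
    )
```

Without a timeout argument, requests will wait indefinitely, so a slow or unreachable server could block the reporting task.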

thesujai and others added 30 commits June 6, 2024 01:27

add errors in additional_sqlite_databases

d814aa1

add errorreports database in aditional sqlite db

afe3ad9

create new errorreports app

5e10f53

add ErrorReports db router

abb63f3

change ellipsis to pass

0e01b17

remove unused ready

20e540c

Merge pull request #12250 from thesujai/distributed-error-reporting

e5f2b4c

Distributed error reporting: Setting up the database that stores all the errors

add ErrorReports model with and its class methods

8b0d8cf

add tests for model methods

1b50c66

add DEVELOPER_MODE = False on settings

f89c601

conditional check of dev mode during writing into database

e49b82d

pass>>>...

2114c4d

use getattr for accessing settings.DEVELOPER_MODE

678d0cd

Merge pull request #12255 from thesujai/distributed-error-reporting-task2

5af22b6

Distributed error reporting: Create model to store captured Errors

Add middleware for handling runtime errors

2e1b852

Add test for error-report middleware

722efe7

Simplify calling insert_or_update_error and tests

4c62c27

put all the constants together in errorreports

a859358

move POSSIBLE_ERRORS to contants.py

fa40e38

improve testcase for middleware

b1c50bf

Merge pull request #12260 from thesujai/distributed-error-reporting-task3

c2207f5

Distributed error reporting: Middleware to catch runtime exception in backend

add serializer ErrorReprotsSerializers:frontend data validation

388dd17

add API for frontend error report

26f0018

testcase for frontendreport view

b6de284

make error_from default to 'frontend'

c67851b

simplify API: remove conditioning before calling insert_or_update_error()

cc968b6

name changes

09e2af3

expect (AttributeError, Exception) while calling insert_or_update

d103ff8

test for anything other than AttributeError or Exception can be caught

00b7be1

Merge pull request #12261 from thesujai/distributed-error-reporting-task4

53ae2ad

Distributed error reporting: Endpoint /api/errorreports/report to store frontend error

thesujai added 6 commits July 29, 2024 14:33

changes: single context instead of two, full version instead of parsed, remove mark_as_reported

dea1d32

changes: importlib instead of pkg_resources and pass context to the save method

86159d0

changes: modelserializer instead of regular, pass context to the save method

22cfce2

use single context

15e34ba

add more screen info in schemas and remove default schemas

85cf9c0

add query_params and improve packages retreival

c5eb2ae

akolson changed the title ~~Distributed error reporting~~ GSoC: Distributed error reporting Jul 29, 2024

thesujai and others added 4 commits July 31, 2024 22:04

Add pingback_id and request_time

14ceceb

raise 400 instead of 500

ee7eb3e

Merge pull request #12382 from thesujai/distributed-error-reporting-task7

d8a713e

Distributed error reporting: Capture more information about the error and its environment

[pre-commit.ci lite] apply automatic fixes

780d544

github-advanced-security bot found potential problems Aug 7, 2024

kolibri/core/errorreports/api.py (Fixed)

akolson and others added 10 commits August 10, 2024 02:17

reruns migrations

4f104ef

reruns migrations

085cf88

reruns migrations

5b909bc

reruns migrations

fd9e925

Removes information exposure through exception

ac07e7a

Removes information exposure through exception

6939d0e

Merge pull request #12551 from akolson/Fixes-migrations-and-tests

aacc517

reruns migrations

refactor stuffs

b64c4c2

remove installation_type and release_version
move request_time_to_error to context
remove sensitive info from the requests info
only use traceback and error_msg to fingerprint an error_report

use get_or_create() with defaults arg

340f714

Merge pull request #12660 from thesujai/distributed-error-reporting

73aed79

DER: Move request_time_to_error to context and remove sensitive info from the requests info

rtibbles assigned bjester and LianaHarris360 Sep 24, 2024

rtibbles added 2 commits September 25, 2024 18:30

Make error reporting pluggable.

8ab29d4

Add error capturing for tasks.

39d07ad

bjester requested changes Oct 8, 2024

rtibbles added 2 commits October 23, 2024 15:23

Refactor to standardize naming of core app and model.

f996888

Merge pull request #12681 from rtibbles/plugin_error_reports

0c82a59

Make error reports into a plugin

rtibbles added the DEV: Core JS API label (Changes related to, or to the Core JS API) Nov 5, 2024
