reader.go fails at line 337 and stops processing after runtime error: index out of range [x] with length x #595

Open
sbyrdsellGIT opened this issue Sep 4, 2024 · 0 comments


github.com/xitongsys/parquet-go v1.6.2

###PROBLEM

I have a large, complex struct with fields of type *struct, *string, *int64, *bool, *map[string]*struct, and map[string][]*string,
and 1 TB of records that I need to process.

If I run

fr, err := local.NewLocalFileReader(parquetFile)
if err != nil {
	log.Println("Can't open file:", parquetFile, err)
	os.Exit(1)
}

pr, err := reader.NewParquetReader(fr, nil, 10) // np = 10: parallel number (int64)
if err != nil {
	log.Println("Can't create parquet reader:", err)
	return
}

stus := make([]myStruct, 10) // read 10 rows

if err = pr.Read(&stus); err != nil {
	log.Printf("Read error: %v", err)
}

If any of the 10 records fails, the whole read fails. In my case the error occurs in reader.go at line 337, during the marshal.Unmarshal call,
i.e.

if err2 := marshal.Unmarshal(&tmap, b, e, dstList[index], pr.SchemaHandler, prefixPath); err2 != nil {

After the error, the read returns "runtime error: index out of range [x] with length x" and does not send back any of the records that marshal.Unmarshal processed successfully, so the application loses all 10 records.
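To illustrate what I think is happening: the message has the shape of a Go runtime panic that was recovered and handed back as an ordinary error. The sketch below is stdlib-only and does not use parquet-go at all. `decodeBatch` is a hypothetical stand-in I made up for illustration, and I am assuming (not claiming) that the library recovers the panic at batch level in a similar way. It only shows how one bad index can void an entire batch.

```go
package main

import (
	"fmt"
)

// decodeBatch is a hypothetical stand-in for a batch decoder (not parquet-go code).
// It collects src[i] for every i in idx and converts any panic into an error.
func decodeBatch(src []int, idx []int) (out []int, err error) {
	defer func() {
		if r := recover(); r != nil {
			out = nil // rows decoded before the panic are discarded
			if e, ok := r.(error); ok {
				err = e
			} else {
				err = fmt.Errorf("%v", r)
			}
		}
	}()
	for _, i := range idx {
		out = append(out, src[i]) // panics when i is out of range
	}
	return out, nil
}

func main() {
	src := []int{10, 20, 30}
	// The third index is invalid, so the two rows decoded before it are lost too.
	out, err := decodeBatch(src, []int{0, 1, 3, 2})
	fmt.Println(out, err)
	// prints: [] runtime error: index out of range [3] with length 3
}
```

Because the recovery happens around the whole batch, the caller only sees the error and never the rows that decoded cleanly before it.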

###WORKAROUND

If I set

pr, err := reader.NewParquetReader(fr, nil, 1) // np = 1: parallel number (int64)

and

stus := make([]myStruct, 1) // read 1 row

if err = pr.Read(&stus); err != nil {
	log.Printf("Read error: %v", err)
}

then I only lose the one record that fails in marshal.Unmarshal, but this makes the process about 10x slower.
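A possible middle ground would be to keep reading in batches of 10 and drop to one row at a time only for a batch that fails. The sketch below is stdlib-only and built on assumptions: `readFunc`, `readWithFallback`, and `fakeSource` are hypothetical names, and it assumes a given row range can be re-read after a failure. I have not verified whether or how the parquet-go reader can be repositioned to do that.

```go
package main

import (
	"fmt"
)

// readFunc is a hypothetical abstraction: read n rows starting at row start.
// It returns an error and no rows if any row in the range cannot be decoded.
type readFunc func(start, n int) ([]string, error)

// readWithFallback reads total rows in batches of batchSize. When a batch
// fails, it retries that range one row at a time and skips only the bad rows.
func readWithFallback(read readFunc, total, batchSize int) (rows []string, skipped []int) {
	for start := 0; start < total; start += batchSize {
		n := batchSize
		if start+n > total {
			n = total - start
		}
		batch, err := read(start, n)
		if err == nil {
			rows = append(rows, batch...)
			continue
		}
		// Slow path, taken only for the batch that failed.
		for i := start; i < start+n; i++ {
			one, err := read(i, 1)
			if err != nil {
				skipped = append(skipped, i)
				continue
			}
			rows = append(rows, one...)
		}
	}
	return rows, skipped
}

// fakeSource simulates a data source in which the rows listed in bad fail to decode.
func fakeSource(bad map[int]bool) readFunc {
	return func(start, n int) ([]string, error) {
		out := make([]string, 0, n)
		for i := start; i < start+n; i++ {
			if bad[i] {
				return nil, fmt.Errorf("row %d: simulated decode failure", i)
			}
			out = append(out, fmt.Sprintf("row-%d", i))
		}
		return out, nil
	}
}

func main() {
	read := fakeSource(map[int]bool{13: true})
	rows, skipped := readWithFallback(read, 25, 10)
	fmt.Println(len(rows), skipped)
	// prints: 24 [13]
}
```

With this shape, the cost of single-row reads is paid only for batches that contain a bad record, so the slowdown would depend on how many such batches there are.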

Does anyone have any suggestions to help me with this error or speed up this process?

-Stan
