IndexError: list index out of range in open_data() #30

Open
break2make opened this issue Oct 23, 2022 · 37 comments

Comments

@break2make

python nc_toolkit.py 
Please install CDO version 1.9.7 or above: https://code.mpimet.mpg.de/projects/cdo/ or https://anaconda.org/conda-forge/cdo
0.7.6
Traceback (most recent call last):
  File "x\gis_experiments\nc_toolkit.py", line 16, in <module>
    main()
  File "x\gis_experiments\nc_toolkit.py", line 7, in main     
    ds = nc.open_data("./data/tasmax_day_EC-Earth3_ssp245_r1i1p1f1_gr_2015.nc")
  File "x\gis_experiments\venv\lib\site-packages\nctoolkit\api.py", line 759, in open_data
    list1 = d.contents.reset_index(drop=True).data_type
  File "x\gis_experiments\venv\lib\site-packages\nctoolkit\api.py", line 1478, in contents
    return self.show_contents()
  File "x\gis_experiments\venv\lib\site-packages\nctoolkit\api.py", line 1315, in show_contents
    if out_inc[i]:
IndexError: list index out of range

I'm using Python 3.10 in Windows 10. Please help to resolve this issue.

@robertjwilson
Member

Hi @break2make. As stated on the package website, nctoolkit will not work on Windows. Your options are to use Linux or macOS, or to use the Windows Subsystem for Linux (WSL).
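
For scripts that may be launched on several machines, a small guard can fail early with a clearer message than the IndexError above. This is only a sketch: `is_supported_platform` is a hypothetical helper, not part of nctoolkit, and it encodes nothing more than the statement that native Windows is unsupported. Under WSL, `platform.system()` reports "Linux", so the guard passes there.

```python
import platform


def is_supported_platform(system_name=None):
    """Return False on native Windows, True otherwise.

    Hypothetical helper: it only encodes the maintainer's statement that
    nctoolkit does not work on native Windows.
    """
    if system_name is None:
        system_name = platform.system()  # e.g. "Linux", "Darwin", "Windows"
    return system_name != "Windows"


if __name__ == "__main__":
    if not is_supported_platform():
        print("Native Windows detected: use Linux, macOS or WSL instead.")
```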

@agnesfrancois

Hello, I have the same error in open_data() while working on Ubuntu (via VirtualBox) with nctoolkit 0.8.6, CDO 2.1.1 and Python 3.10.8.
I used a virtual environment in which I first installed the latest version of CDO, then nctoolkit, which came out as 0.2.2 (via conda install -c conda-forge nctoolkit; I don't really understand why such an old version was installed). With that version open_data() worked, but I could not read my netCDF files correctly (empty .time(), empty .years() and IndexError: list index out of range for .plot()), so I updated nctoolkit to the most recent version, which gave me this error...

@robertjwilson
Member

Hi @agnesfrancois

Can you provide the line of code that's giving the problem and the full python error?

The package is tested daily with CDO 2.1.1 and Python 3.10.8: https://app.circleci.com/pipelines/github/pmlmodelling/nctoolkit. And the tests are all passing at the minute. So it's most likely a versioning issue. Potentially conda is installing a very old version of a dependency that's incompatible in some way.

If possible, could you upload a .yml file of your conda environment so that I can try to reproduce the problem?

@agnesfrancois

Yes, thank you for your really quick answer. My code (with screenshot attached) :
import matplotlib as plt
import nctoolkit as nc
...
file = nc.open_data('SWIO12_CNRM-ESM2-1_HIST_r1i1p1f2_CNRM-ALADIN63_v2_frac_land_fx_once_REU_grid003.nc')

IndexError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 file = nc.open_data('SWIO12_CNRM-ESM2-1_HIST_r1i1p1f2_CNRM-ALADIN63_v2_frac_land_fx_once_REU_grid003.nc')

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/api.py:755, in open_data(x, checks, **kwargs)
    752 d._thredds = thredds
    754 if (len(d) == 1) and checks and (thredds is False):
--> 755     d_contents = d.contents.reset_index(drop = True)
    756     try:
    757         d_sub = d_contents.query("fill_value == 0.0").reset_index(drop = True)

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/api.py:1506, in DataSet.contents(self)
   1499 @property
   1500 def contents(self):
   1501     """
   1502     Detailed list of variables contained in a dataset.
   1503     This will only display the variables in the first file of an ensemble.
   1504     """
-> 1506     return self.show_contents()

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/api.py:1335, in DataSet.show_contents(self, n)
   1333 i = 1
   1334 while True:
-> 1335     if out_inc[i]:
   1336         break
   1337     i += 1

IndexError: list index out of range

image

I don't see how to upload a .yml here (file type not supported) but here is my spec file
spec-file-export.txt

@robertjwilson
Member

robertjwilson commented Jan 31, 2023

OK. Based on the output, nctoolkit's backend, CDO, does not like the file much, @agnesfrancois. Is there any way to share the file?

For now, you could try:

file = nc.open_data('SWIO12_CNRM-ESM2-1_HIST_r1i1p1f2_CNRM-ALADIN63_v2_frac_land_fx_once_REU_grid003.nc', checks = False)

This won't check the contents of the file when it is opened. The checking is currently throwing the error.

Can you try running this on the command line:

cdo sinfon SWIO12_CNRM-ESM2-1_HIST_r1i1p1f2_CNRM-ALADIN63_v2_frac_land_fx_once_REU_grid003.nc

and add the results on here? That will show if there is a fundamental problem with the data itself or if maybe I need to tweak something in nctoolkit to handle the file.
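
The same `cdo sinfon` call can be made from Python, which also tests whether the Python process can reach CDO at all. The following is a minimal sketch using only the standard library; `cdo_sinfon` is a hypothetical helper name, not an nctoolkit function.

```python
import shutil
import subprocess


def cdo_sinfon(path, cdo="cdo"):
    """Run `cdo sinfon <path>` and return (returncode, stdout, stderr).

    Returns None when the executable cannot be found on PATH.
    """
    if shutil.which(cdo) is None:
        return None
    result = subprocess.run(
        [cdo, "sinfon", path], capture_output=True, text=True
    )
    return result.returncode, result.stdout, result.stderr
```

A `None` result means the executable is not on the `PATH` seen by Python, which points to an environment problem rather than a problem with the file.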

@agnesfrancois

agnesfrancois commented Jan 31, 2023

Thank you for your answer. Unfortunately, I don't think I can share the file. Here are the results from the terminal :
image
I can also see with netCDF4 that the file was made with CDO v1.7.0 with convention CF-1.6 and file format HDF5.
(Indeed, with checks=False, I don't have the error anymore, but functions on the dataset don't work then)

@robertjwilson
Member

OK. That's strange. Based on that, you shouldn't be getting the error message. It sounds like there is some issue calling CDO (using subprocess) in nctoolkit.

What errors are you getting when you try methods?
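
For context, the traceback posted earlier in this thread ends in a `while True` loop that indexes `out_inc[i]` until it finds a truthy entry. If the list built from CDO's output is empty, or has no truthy entry, a loop of that shape runs off the end and raises IndexError. The sketch below reproduces that failure mode next to a guarded variant; both function names are hypothetical and the lists in the usage are made up.

```python
def first_truthy_index(flags, start=1):
    """Return the index of the first truthy entry at or after `start`, else None."""
    for i in range(start, len(flags)):
        if flags[i]:
            return i
    return None


def first_truthy_index_unguarded(flags, start=1):
    """Same search written like the loop in the traceback; can raise IndexError."""
    i = start
    while True:
        if flags[i]:
            break
        i += 1
    return i
```

With an empty list, as might result when the CDO call returns no output, the first function returns `None` while the second raises `IndexError: list index out of range`.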

@agnesfrancois

agnesfrancois commented Jan 31, 2023

I tried with several files I have. When I try .variables, .years or .times, the result is an empty list (the other files have time-dependent data); when I try .spatial_mean(), nothing is shown; when I try .mean(), I get "AttributeError: 'DataSet' object has no attribute 'mean'"; and when I try to plot my files I get the following error (the same IndexError as when I try file.contents):
image

There might be something I don't get, I will try several things.

EDIT : it seems that when I convert my files with file.to_dataframe(), everything works well, so I will work with that if I don't manage to work with DataSets...

@robertjwilson
Member

Can you try the following:

import nctoolkit as nc
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE/sst.mon.mean.nc")
ds.subset(time = 0)
ds.plot()

If that doesn't work then there must be a package mismatch.

Also, try:
nc.cdo_version()

That should show 2.1.1. If it does not then CDO is not actually visible to Python.
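
A quick, nctoolkit-independent way to check whether CDO is visible to Python is to ask the standard library where it finds the executable. This is a sketch; `find_executable` and `report` are hypothetical helper names.

```python
import os
import shutil
import sys


def find_executable(name="cdo"):
    """Return the full path of `name` as seen by this Python process, or None."""
    return shutil.which(name)


def report(name="cdo"):
    """Collect the facts needed to tell whether Python can see the executable."""
    return {
        "executable": find_executable(name),
        "python": sys.executable,
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }
```

If `report()["executable"]` is `None` while `cdo --version` works in a terminal, the Python process is most likely running with a different `PATH` than the terminal.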

@agnesfrancois

Indeed, there is a problem with both tests.
nc.cdo_version() does not show anything, and the four lines of code give me the following error :

The dataset has been reset to the starting point due to a run failure! Please change commands, where applicable, and re-run.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/runthis.py:1024, in run_this(os_command, self, output, out_file, suppress)
   1022 else:
-> 1024     target = run_cdo(
   1025         ff_command, target, out_file, precision=self._precision
   1026     )
   1027     target_list.append(target)

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/runthis.py:593, in run_cdo(command, target, out_file, overwrite, precision)
    592     remove_safe(target)
--> 593     raise ValueError(f"{command} was not successful. Check output")
    595 session_info["latest_size"] = os.path.getsize(target)

ValueError: cdo -L  -seltimestep,1 https://psl.noaa.gov/thredds/dodsC/Datasets/COBE/sst.mon.mean.nc /var/tmp/nctoolkitzmxhixgrnctoolkittmpnti50ji4.nc was not successful. Check output

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[15], line 4
      2 ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE/sst.mon.mean.nc")
      3 ds.subset(time = 0)
----> 4 ds.plot()

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/plot.py:61, in plot(self, vars, autoscale, out, coast, **kwargs)
     58     kwargs["title"] = ""
     60 # run any commands
---> 61 self.run()
     63 if session_info["coast"]:
     64     coast = True

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/run.py:38, in run(self)
     35 if self._merged:
     36     output_method = "one"
---> 38 run_this(cdo_command, self, output=output_method)
     40 self._merged = False
     42 self._execute = False

File ~/anaconda3/envs/cdo/lib/python3.10/site-packages/nctoolkit/runthis.py:1198, in run_this(os_command, self, output, out_file, suppress)
   1196 except Exception as e:
   1197     self.reset()
-> 1198     raise ValueError(e)

ValueError: cdo -L  -seltimestep,1 https://psl.noaa.gov/thredds/dodsC/Datasets/COBE/sst.mon.mean.nc /var/tmp/nctoolkitzmxhixgrnctoolkittmpnti50ji4.nc was not successful. Check output

@robertjwilson
Member

OK. It sounds like something has gone wrong in your conda environment and CDO isn't accessible from Python.

Try this:

import subprocess
subprocess.Popen("cdo --version", shell = True)

You should get something like the output below. If you don't then something has gone wrong in your conda environment.

Climate Data Operators version 2.1.1 (https://mpimet.mpg.de/cdo)
System: x86_64-conda-linux-gnu
CXX Compiler: /home/conda/feedstock_root/build_artifacts/cdo_1671208954590/_build_env/bin/x86_64-conda-linux-gnu-c++ -fPIC -DPIC -g -O2 -fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /local1/data/scratch/rwi/mambaforge3/envs/nc/include -fdebug-prefix-map=/home/conda/feedstock_root/build_artifacts/cdo_1671208954590/work=/usr/local/src/conda/cdo-2.1.1 -fdebug-prefix-map=/local1/data/scratch/rwi/mambaforge3/envs/nc=/usr/local/src/conda-prefix -fopenmp -pthread
CXX version : unknown
C Compiler: /home/conda/feedstock_root/build_artifacts/cdo_1671208954590/_build_env/bin/x86_64-conda-linux-gnu-cc -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /local1/data/scratch/rwi/mambaforge3/envs/nc/include -fdebug-prefix-map=/home/conda/feedstock_root/build_artifacts/cdo_1671208954590/work=/usr/local/src/conda/cdo-2.1.1 -fdebug-prefix-map=/local1/data/scratch/rwi/mambaforge3/envs/nc=/usr/local/src/conda-prefix -fopenmp -pthread -pthread
C version : unknown
F77 Compiler: /home/conda/feedstock_root/build_artifacts/cdo_1671208954590/_build_env/bin/x86_64-conda-linux-gnu-gfortran -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /local1/data/scratch/rwi/mambaforge3/envs/nc/include -fdebug-prefix-map=/home/conda/feedstock_root/build_artifacts/cdo_1671208954590/work=/usr/local/src/conda/cdo-2.1.1 -fdebug-prefix-map=/local1/data/scratch/rwi/mambaforge3/envs/nc=/usr/local/src/conda-prefix
F77 version : GNU Fortran (conda-forge gcc 11.3.0-19) 11.3.0
Features: 31GB 12threads c++17 OpenMP45 Fortran pthreads HDF5 NC4/HDF5/threadsafe OPeNDAP udunits2 proj xml2 magics curl fftw3 sse3
Libraries: yac/2.6.1 HDF5/1.12.2 proj/9.1.1 xml2/2.10.3 curl/7.86.0 magics/4.12.1
CDI data types: SizeType=size_t
CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5 nczarr
CDI library version : 2.1.1
cgribex library version : 2.0.2
ecCodes library version : 2.27.0
NetCDF library version : 4.8.1 of Oct 31 2022 22:17:45 $
HDF5 library version : 1.12.2 threadsafe
exse library version : 1.4.2
FILE library version : 1.9.1
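
`subprocess.Popen` only launches the command and lets it print to the console. To capture the banner and pull out the version number programmatically, `subprocess.run` can be used instead. This is a sketch under the assumption that the first line looks like the one above ("Climate Data Operators version 2.1.1 ..."); both function names are hypothetical.

```python
import re
import subprocess


def parse_cdo_version(text):
    """Extract (major, minor, patch) from the first 'version X.Y.Z' in text."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cdo_version_from_process(cdo="cdo"):
    """Run `cdo --version`, returning the parsed version or None on failure."""
    try:
        result = subprocess.run(
            [cdo, "--version"], capture_output=True, text=True
        )
    except OSError:  # executable missing or not runnable
        return None
    # Search both streams, since the banner may be written to either one.
    return parse_cdo_version(result.stdout + result.stderr)
```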

@agnesfrancois

OK, thank you. I get the following output, so something is wrong with my environment. I will try again and create a new one.

image

@robertjwilson
Member

That suggests the jupyter notebook you are using is not actually from the CDO environment.

Check by running this in the notebook

conda list

You should see CDO in the packages. Otherwise, the notebook package is probably outside the CDO environment.

@agnesfrancois

Indeed, CDO is in the packages...

image

@robertjwilson
Member

Very strange. Try this in the notebook:

! cdo --version

It's possible the Python version being used by the notebook does not actually come from the environment. Did you specify Python version when creating the environment?
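
One way to test that suspicion from inside the notebook is to compare the interpreter path with the environment directory. This is a sketch: the helper names are hypothetical, and relying on the `CONDA_PREFIX` variable is an assumption, since a notebook kernel may inherit it from wherever the Jupyter server was started.

```python
import os
import sys


def python_belongs_to_env(executable, env_prefix):
    """Return True if `executable` lives inside the directory `env_prefix`."""
    executable = os.path.abspath(executable)
    env_prefix = os.path.abspath(env_prefix)
    return os.path.commonpath([executable, env_prefix]) == env_prefix


def current_python_in_active_env():
    """Compare sys.executable with CONDA_PREFIX; None if the variable is unset."""
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix is None:
        return None
    return python_belongs_to_env(sys.executable, prefix)
```

Printing `sys.executable` in a notebook cell and checking that it points into the environment that contains CDO is the simplest form of the same check.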

@agnesfrancois

I have the following output (in French, sorry):
image

For the environment I was using until now, yes I had to specify the version. As I wanted to upgrade nctoolkit v0.2.2, I had to specify Python v3.10 as I first had Python v3.11. However, I just tried with a new environment, with conda install -c conda-forge nctoolkit as the only command (it was not working before), and I have the exact same outputs

@robertjwilson
Member

OK, that's strange. CDO is in the environment, but is not accessible from the notebook. Running ! cdo --version from a notebook should be the equivalent of running it from the terminal in the environment, as far as I understand it. Something strange must be going on with the environment.

@agnesfrancois

Hello! Problem solved after checking this page: https://code.mpimet.mpg.de/boards/1/topics/13131. JupyterLab was not installed in my virtual environment, so the notebook was not really linked to that environment.

@robertjwilson
Member

Thanks for confirming the problem @agnesfrancois. Always tricky to ensure everything you are using is in the environment

@fipoucat

I have a similar trouble on Ubuntu and wonder if a solution is available?

import nctoolkit as nc
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/sarr/anaconda3/envs/CDO_environment/lib/python3.11/site-packages/nctoolkit/__init__.py", line 54, in <module>
    if valid(cdo_version) is False:
       ^^^^^^^^^^^^^^^^^^
  File "/home/sarr/anaconda3/envs/CDO_environment/lib/python3.11/site-packages/nctoolkit/__init__.py", line 26, in valid
    where = [m.start() for m in re.finditer(sub, string)][n - 1]
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
IndexError: list index out of range

@robertjwilson
Member

Hi @fipoucat. Which version of nctoolkit and CDO are installed?

@fipoucat

Hi, I am using nctoolkit 0.8.7 and cdo 2.1.1

@robertjwilson
Member

Can you double check that you are actually using nctoolkit 0.8.7 in your environment, @fipoucat? It looks like it is picking up a much older version. You are getting an error at line 54 in the __init__.py file, but there isn't a line 54: https://github.com/pmlmodelling/nctoolkit/blob/master/nctoolkit/__init__.py

This looks like nctoolkit version 0.2.X, which won't work with CDO 2.0.X because they changed how output was returned.
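
Because importing the old nctoolkit is what fails here, the installed version has to be read without importing the package. The standard library's `importlib.metadata` can do that. A minimal sketch, with `installed_version` as a hypothetical helper name:

```python
import importlib.metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("nctoolkit:", installed_version("nctoolkit"))
```

Run this with the same interpreter that raises the error, so that the version reported is the one that interpreter actually picks up.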

@fipoucat

I'll try to uninstall it and reinstall and will update you

@fipoucat

@robertjwilson, that's correct: version 0.2.2 was installed in the cdo env, so uninstalling it solved the issue.
Thank you for the quick support

@fipoucat

I was too fast, when testing the example I am getting a strange error:
AttributeError: module 'nctoolkit' has no attribute 'open_data'

@robertjwilson
Member

Please add the code that caused the issue. This doesn't sound like something that could happen, so it's probably an environment issue.

@fipoucat

I am testing the example in the user guide:
import nctoolkit as nc

import numpy as mp

ds = nc.open_data("/home/sarr/work/READING_ASSESS/sst.mon.mean.nc")
Traceback (most recent call last):

Cell In[4], line 1
ds = nc.open_data("/home/sarr/work/READING_ASSESS/sst.mon.mean.nc")

AttributeError: module 'nctoolkit' has no attribute 'open_data'

@robertjwilson
Member

robertjwilson commented Feb 17, 2023

I don't think there is anything within nctoolkit that could cause that to happen. Something must have gone wrong when installing or setting up environments.

What happens when you try to autocomplete after typing nc. in your notebook/ipython? Does anything show up?
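
The autocomplete check can also be done in code, by listing what the imported module exposes and where it was loaded from. This is a sketch with a hypothetical helper name. An unexpected file path, or a missing one, would suggest that Python is importing something other than the installed package.

```python
import importlib


def describe_module(name):
    """Import `name` and report where it was loaded from and its public names."""
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "public": sorted(n for n in dir(module) if not n.startswith("_")),
    }
```

For this issue, one would call `describe_module("nctoolkit")` and check whether `"open_data"` appears in the `"public"` list.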

@fipoucat

I cleaned up all the installations, and it now works after reinstalling.

Thank you

@robertjwilson
Member

Thanks @fipoucat

@fipoucat

@robertjwilson ,
A follow up problem since my previous comment on my installation problem:
When opening my netCDF file with nc.open_data("output.nc") I get this message:
The variable(s) z,lsm,cl have integer data type. Consider setting data type to float 'F64' or 'F32' using set_precision.
Any hint for a solution?
Thank you

@robertjwilson
Member

This is normally not something to worry about. nctoolkit uses CDO which will preserve the netCDF data type when carrying out calculations. So if you have an integer data type, all calculations will use that data type.

However, you can change the data type by doing

ds.set_precision("F32")

In general, data type doesn't matter too much. There are some cases where you need to be careful. For example, let's say you wanted to calculate the fraction of years when temperature in a dataset was above 10 C. You could do the following:

ds > 10
ds.tmean()

However, if the data type is integer, you will probably just have 0s and 1s in the dataset before tmean, and you can only end up with a 0 or 1 in the output because it is integer.
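
The arithmetic behind that caveat can be shown without nctoolkit or CDO. The sketch below uses plain Python and made-up temperatures; `fraction_above` is a hypothetical function, and truncating with `int()` is an assumption standing in for "the result has to fit an integer data type" (rounding instead would still leave only 0 or 1).

```python
def fraction_above(values, threshold, integer_storage=False):
    """Fraction of `values` above `threshold`.

    With integer_storage=True the mean is truncated to an int, mimicking
    (as an assumption) a result that must be stored in an integer data type.
    """
    flags = [1 if v > threshold else 0 for v in values]
    mean = sum(flags) / len(flags)
    return int(mean) if integer_storage else mean


temps = [8, 12, 11, 9]  # made-up temperatures in degrees C

print(fraction_above(temps, 10))                        # floating-point result
print(fraction_above(temps, 10, integer_storage=True))  # integer result
```

Half of the made-up values exceed 10, so the floating-point result is 0.5, while the integer result collapses to 0. Switching to a floating-point type with `set_precision`, as suggested above, addresses this kind of loss.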

@denisthenichita

Hello, I get the same error when trying to import nctoolkit.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[129], line 1
----> 1 import nctoolkit as nc

File ~/miniconda3/lib/python3.10/site-packages/nctoolkit/__init__.py:54
     52 cdo_check = cdo_check.replace("b'", "").strip()
     53 cdo_version = cdo_check.split("(")[0].strip().split(" ")[-1]
---> 54 if valid(cdo_version) is False:
     55     print(
     56         "Please install CDO version 1.9.3 or above: https://code.mpimet.mpg.de/projects/cdo/ or https://anaconda.org/conda-forge/cdo"
     57     )

File ~/miniconda3/lib/python3.10/site-packages/nctoolkit/__init__.py:26, in valid(string)
     24 wanted = ""
     25 n = 3
---> 26 where = [m.start() for m in re.finditer(sub, string)][n - 1]
     28 string = re.sub("[A-Za-z]", "", string)
     30 before = string[:where]

IndexError: list index out of range

I have cdo version 2.0.3

@robertjwilson
Member

Which version of nctoolkit are you using @denisthenichita? Based on the error message, you have an old version installed. This can happen when you install with conda, which can install version 0.2x instead of a recent version.

Old versions of nctoolkit won't work with CDO 2.0x because they changed how output was returned.

Try updating to nctoolkit 0.9.0. Also update CDO to 2.0.5 or above. nctoolkit won't work with CDO 2.0.3 because a bug in CDO was causing one of the nctoolkit tests to fail. This was fixed in 2.0.5.
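
A version requirement like "2.0.5 or above" is best checked numerically rather than by comparing strings. The sketch below assumes plain dotted version strings such as "2.0.3"; the helper names are hypothetical, and the default minimum is simply the value mentioned above.

```python
def version_tuple(version):
    """Turn '2.0.5' into (2, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(version, minimum="2.0.5"):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)
```

Comparing tuples of integers avoids the string-comparison pitfall where "2.10.0" sorts before "2.9.0".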

@denisthenichita

denisthenichita commented Mar 26, 2023

Thank you for the very quick response. conda update nctoolkit tells me "# All requested packages already installed." Indeed, I have version 0.2 of nctoolkit. I am sorry if this is a rookie mistake; I am relatively new to Python. Working on WSL at the moment.

@robertjwilson
Member

That version is almost 2.5 years old.

Just do this to get the latest:

conda install nctoolkit=0.9.0

My recommendation would be to use mambaforge instead of conda. You won't run into these issues with it as much: https://github.com/conda-forge/miniforge
