Issue opening jls2 file with Joulescope v0.9.11

Hi
I recorded a JLS file using pyjoulescope_examples\capture.py for 3930 s. I have two recordings: one opens and the other does not. The output is shown below.

=====
QApplication: invalid style override 'adwaita' passed, ignoring it.
	Available styles: Windows, Fusion
WARNING:2022-05-04 09:46:17,385:raw.c:121:pyjls.c:file header length 0, not closed gracefully
ERROR:2022-05-04 09:46:17,386:main.py:859:joulescope_ui.main:while opening device
Traceback (most recent call last):
  File "/home/nitdrbill/miniconda3/lib/python3.10/site-packages/joulescope_ui/main.py", line 857, in _device_open
    self._device.open(self.resync_handler('device_event'))
  File "/home/nitdrbill/miniconda3/lib/python3.10/site-packages/joulescope_ui/recording_viewer_device_v2.py", line 527, in open
    self._open()
  File "/home/nitdrbill/miniconda3/lib/python3.10/site-packages/joulescope_ui/recording_viewer_device_v2.py", line 503, in _open
    self._reader = Reader(self._filename)
  File "pyjls\binding.pyx", line 370, in pyjls.binding.Reader.__init__
RuntimeError: open failed 9

What else can I provide or test? I suspect a file inconsistency. How can I check whether a file has a correct ending (the trailing data in the file, not the filename extension)?

Hi @lukGWF - sorry to hear that one of your JLS v2 files is having issues. The WARNING on the second line tells the story: file header length 0, not closed gracefully. This means that the capture exited without calling close on the JLS v2 file.

The JLS v2 file does contain all of the data. However, when not closed properly, the writer does not add the final indices. The last writer step updates the length, which makes it very easy to detect this condition. The reader is not yet able to perform recovery for this condition. See Reliability under Features. All the information is there in the JLS v2 file, we just have not yet written the software to make it happen.
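A minimal sketch of how a script could check whether a JLS v2 file opens, based only on the traceback above, where pyjls.Reader raised RuntimeError for the file that was not closed gracefully. The helper name jls_opens and its reader_factory parameter are hypothetical and exist only for illustration. Other pyjls versions may behave differently, for example by recovering the file instead of raising.

```python
def jls_opens(path, reader_factory=None):
    """Return (ok, message) describing whether a JLS v2 file can be opened.

    reader_factory is a hypothetical hook for illustration; by default the
    pyjls Reader class is used, as in the traceback above.
    """
    if reader_factory is None:
        # Imported lazily so the helper has no hard dependency on pyjls.
        from pyjls import Reader as reader_factory
    try:
        reader = reader_factory(path)
    except RuntimeError as ex:
        # The traceback above shows this exception: "open failed 9".
        return False, str(ex)
    reader.close()
    return True, "ok"
```

Calling jls_opens("out.jls") right after a capture gives an early indication that the file was not finalized, before a long recording is handed to the UI or to a plotting script.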

For today, is it easy for you to rerun this capture? If not, I can try to implement a recovery method sooner rather than later, but it will still be at least a few days before something is ready.

Hi @mliberty - I have run into this quite a few times, so I would be thankful if you could prioritize it. I’m going to use it in jls_plot.py.

Hi @lukGWF - I am concerned that you have seen this multiple times. Is your capture code crashing frequently?

If not, it is likely that the capture code is not always closing the JLS file. You can use the writer context manager, or try/finally in your code to ensure that everything closes when done. For example, check out the capture_jls_v2.py example.
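The two closing patterns mentioned here can be sketched in plain Python. DemoWriter is a hypothetical stand-in, not the pyjls Writer API. It only illustrates that close() runs even when the capture loop is interrupted.

```python
import contextlib


class DemoWriter:
    """Hypothetical stand-in for a file writer that must be finalized."""

    def __init__(self):
        self.closed = False
        self.samples = []

    def write(self, sample):
        self.samples.append(sample)

    def close(self):
        # A real writer would finalize its indices and header here.
        self.closed = True


def capture_try_finally(writer, samples):
    """Pattern 1: try/finally guarantees close() on normal exit or exception."""
    try:
        for sample in samples:
            writer.write(sample)
    finally:
        writer.close()


def capture_context_manager(writer, samples):
    """Pattern 2: a context manager calls close() when the block exits."""
    with contextlib.closing(writer):
        for sample in samples:
            writer.write(sample)
```

Both patterns cover CTRL-C, which Python raises as KeyboardInterrupt. Neither can help if the process is killed outright or the computer reboots, because the interpreter never gets a chance to run the cleanup code.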

If you would like, I am happy to review your capture code. You can attach it here or DM me privately.

You will likely not be happy with the repair process if you need to do it often. It will be an extra, manual step, and it may not be super fast, either.

Hi @mliberty
I just used the original capture.py with frequency 2000000, signals current.

Well, capture.py does use try/finally. When you recorded JLS files that were not closed correctly, did you notice that the capture.py script exited cleanly and correctly? For example, the computer rebooting due to Windows Update will not correctly close capture.py.

Ok. I will use capture_jls_v2.py for the upcoming captures.

I don’t think that you will see a difference between capture.py and capture_jls_v2.py. capture.py uses try/finally to ensure the close is called. capture_jls_v2.py uses a context manager. Both should close and finalize the JLS v2 file correctly as long as the program exits correctly.

When you recorded JLS files that were not closed correctly, did you notice that the capture.py script exited cleanly and correctly?

I can’t remember anything special, so no, but I will keep an eye on it.

Thanks! If the program exits normally and you get a JLS v2 file that does not load correctly, I definitely want to know. We have tested this pretty well, and we have not seen any issues so far. If you see issues on your machine, then we definitely have more work to do!

We still need to implement the JLS v2 repair tool, but I would much rather spend time now to make sure you are getting clean, correct captures.

One more thought… Are you running from the command line on Windows? If so, are you stopping the capture using either the --duration command line argument or CTRL-C? You can then manually check the return code, which should be 0, using:

echo %ERRORLEVEL%

If you are starting the process from another program, are you sending SIGINT to stop it and then waiting for it to close? Sending SIGTERM is bad…
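When the capture is started from another Python program, the same return-code check can be scripted. This is a minimal sketch using only the standard library. The helper name run_and_report is hypothetical, and it assumes the capture stops on its own (for example with --duration) rather than being signaled.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command to completion and return its exit code.

    Equivalent in spirit to checking %ERRORLEVEL% after the command ends.
    """
    completed = subprocess.run(cmd)
    if completed.returncode != 0:
        print(f"exited with code {completed.returncode}: {cmd}", file=sys.stderr)
    return completed.returncode
```

For example, run_and_report([sys.executable, "bin/capture.py", "--jls", "2", "--duration", "60", "out.jls"]) should return 0 after a clean capture. Logging any nonzero value next to the output filename makes it easy to tell later which recordings ended abnormally.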

I am using --duration to close.

Here is what I tried:

Microsoft Windows [Version 10.0.22000.613]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Matth>python -VV
Python 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)]

C:\Users\Matth>cd c:\repos\Jetperch\pyjoulescope_examples

c:\repos\Jetperch\pyjoulescope_examples>python bin\capture.py --jls 2 --duration 60 out.jls
Capturing data from Joulescope:001422: type CTRL-C to stop

I then opened out.jls in the Joulescope UI, and it opened normally, as expected.

I then performed a second capture where I intentionally disconnected the Joulescope JS110 USB cable during the capture:

c:\repos\Jetperch\pyjoulescope_examples>python bin\capture.py --jls 2 --duration 60 out.jls
Capturing data from Joulescope:001422: type CTRL-C to stop
endpoint halt 1: EndpointIn WinUsb_GetOverlappedResult fatal: [31] A device attached to the system is not functioning.
Device.stop() while attempting _stream_settings_send

c:\repos\Jetperch\pyjoulescope_examples>

I was able to open the JLS file in the Joulescope UI. The JLS file only contains a few seconds of data, but the program ran for the full 60 seconds. So, I could not duplicate the behavior you are seeing with this quick test.

Inspecting the code, I see that close mostly does not report errors. See close and jls_fsr_wr_close for examples. We can definitely improve this. However, none of these things should fail if your storage drive is still working.

Can you provide more details so that I can better investigate?

  1. What does python -VV display (that’s two “V”, not “W”)?
  2. What is your operating system?
  3. If you are on Windows, are you using python from python.org or some other distribution?
  4. If you run pip3 install -U joulescope pyjls numpy, does anything update?
  5. What are the exact capture.py arguments you are using?
  6. When you get a capture where you cannot open the recorded JLS v2 file, what is printed at the command line?
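Several of these details can be gathered in one step. The sketch below uses only the Python standard library. The function name environment_report is hypothetical, and it reports installed distribution versions without importing the packages themselves.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("joulescope", "pyjls", "numpy")):
    """Collect the Python version, OS description, and package versions."""
    report = {"python": sys.version, "os": platform.platform()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter that runs capture.py, since a conda environment and a system Python can have different packages installed.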

1>Python 3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)]
2>Windows 10 21H1 (and ZorinOS16, Kernel 5.13.0-40, Python 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0])
3>Win: miniconda3, ZorinOS miniconda3
4> had not installed joulescope (used binary instead)
5> python bin\capture.py filename.jls --signals current --jls 2 --frequency 2000000 --duration 60
6> the original post; I haven’t found more yet. Tested 6x 60s recordings, 1x 2700s recording, and 1x 60s recording with “Windows Lock Screen”.
I am running a couple of long-term recordings these days, so I will keep an eye on them and note the results here.

2> Ok, I understand that you are using both Windows and ZorinOS (Ubuntu-based). Have you seen this issue occur on Windows, ZorinOS, or both?

4> I do not understand what you mean here. Could you be more specific? Did you build the pyjoulescope native code from the GitHub repo? Install with conda install? Something else?

6> I meant, “What does capture.py display when this failure occurs?” I believe that the original post was from when you were later trying to open the recorded JLS file with your jls_plot.py script.

4> I have downloaded the ui as binary and executed it so not from within the python env.

I believe that the capture did not close the JLS v2 file correctly, and I am looking to get more information about how you run pyjoulescope_examples\capture.py. I am not asking about the Joulescope UI. Did you do pip3 install joulescope or conda install joulescope?

What prints if you enter this at the command line:

python -c "import joulescope; import numpy; print(f'{joulescope.__version__} {numpy.__version__}')"

I see:

0.9.11 1.22.3

Hi @mliberty,
Here the last unsuccessful recording:

python capture.py --frequency 200000 --signals current --jls 2 --duration 90000 e:\20220602_test-lorawan_and_eco_half-daily_90000s.jls
Capturing data from Joulescope:004271: type CTRL-C to stop
process: stream_buffer is behind: 1055555298 + 2000000 < 1057873068
process: stream_buffer is behind: 1059573186 + 2000000 < 1063059228
process: stream_buffer is behind: 106413501726 + 2000000 < 106415981784
process: stream_buffer is behind: 106415981784 + 2000000 < 106419041064
process: stream_buffer is behind: 106419041064 + 2000000 < 106422484896

versions used:

python -c "import joulescope; import numpy; print(f'{joulescope.__version__} {numpy.__version__}')"
0.9.11 1.22.3

Hi @lukGWF and thank you for the additional information. The warning “process: stream_buffer is behind” means that samples are being lost because your computer decided not to service your Joulescope. Things like Windows Update, antivirus scans, power management, and backups can cause your computer to stop doing other activities and exhibit this behavior.

The JLS v2 recording is supposed to handle lost samples, but perhaps it’s not. Do you recall seeing this same warning on any successful captures?
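To answer that from saved console output, a small scanner can pick out these warning lines. This is a sketch under assumptions: the thread does not define the three printed numbers, so the "gap" below is just arithmetic on the printed inequality (right side minus the sum on the left), not a confirmed count of lost samples. The function name find_behind_warnings is hypothetical.

```python
import re

# Matches lines such as:
# process: stream_buffer is behind: 1055555298 + 2000000 < 1057873068
_BEHIND = re.compile(r"stream_buffer is behind: (\d+) \+ (\d+) < (\d+)")


def find_behind_warnings(lines):
    """Return a list of (a, b, c, gap) tuples, one per warning line.

    gap is c - (a + b), computed from the printed inequality only.
    """
    found = []
    for line in lines:
        match = _BEHIND.search(line)
        if match:
            a, b, c = (int(group) for group in match.groups())
            found.append((a, b, c, c - (a + b)))
    return found
```

Redirecting the capture.py output to a text file for every run and passing its lines to find_behind_warnings makes it possible to compare captures that open correctly against captures that do not.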