-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow csv unescaping for station names #81
Conversation
to get rid of the std::string allocations which are unnecessary in ~99% of the cases were nothing was escaped, I suppose a global string pool could be used that string_views could point to. |
Wouldn't it be sufficient to have If it would have a Having global state is something I would try to avoid as it can introduce unexpected problems in with multi-threading, etc. |
In this case here ( |
Wasn't it just a string_view pointing to the csv source data? |
True. You're right. I think I'll try to change the procedure a bit so data is directly stored to the |
I think another option would be to use |
|
Next step would be to update |
I just can't get pkg do create a state that I can work in right now. |
If you work directly on the nigiri repo (not in deps, but nigiri = root), pkg should not interfere. pkg in general only checks out a version if |
Otherwise, you can just replace |
After switching to generic_string, the code now crashes.
0x00005555560826a1 in cista::generic_string<char const*>::reset (this=0x656c6b) at /home/jonah/kde/src/motis/deps/cista/include/cista/containers/string.h:67
67 if (!h_.is_short_ && h_.ptr_ != nullptr && h_.self_allocated_) {
(gdb) bt
#0 0x00005555560826a1 in cista::generic_string<char const*>::reset (this=0x656c6b) at /home/jonah/kde/src/motis/deps/cista/include/cista/containers/string.h:67
#1 0x000055555570033c in cista::generic_string<char const*>::~generic_string (this=0x656c6b, __in_chrg=<optimized out>) at /home/jonah/kde/src/motis/deps/cista/include/cista/containers/string.h:31
#2 utl::csv_col<cista::generic_string<char const*>, utl::csv_col_name<&utl::const_str<(char)115, (char)116, (char)111, (char)112, (char)95, (char)110, (char)97, (char)109, (char)101, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0>::s> >::~csv_col (this=0x656c6b, __in_chrg=<optimized out>)
at /home/jonah/kde/src/motis/deps/utl/include/utl/parser/csv_range.h:23
#3 csv_stop::~csv_stop (this=0x7fffffffac90, __in_chrg=<optimized out>) at /home/jonah/kde/src/motis/deps/nigiri/src/loader/gtfs/stop.cc:177
#4 utl::csv_range<nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::csv_stop, utl::line_range<utl::buf_reader<utl::progress_tracker::update_fn()::<lambda(size_t)> > >, ','>::read_row (s=..., this=<error reading variable: Cannot access memory at address 0xfffffffffffffd13>)
at /home/jonah/kde/src/motis/deps/utl/include/utl/parser/csv_range.h:103
#5 utl::csv_range<nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::csv_stop, utl::line_range<utl::buf_reader<utl::progress_tracker::update_fn()::<lambda(size_t)> > >, ','>::begin (this=<error reading variable: Cannot access memory at address 0xfffffffffffffd13>) at /home/jonah/kde/src/motis/deps/utl/include/utl/parser/csv_range.h:109
#6 utl::operator|<utl::csv_range<nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::csv_stop, utl::line_range<utl::buf_reader<utl::progress_tracker::update_fn()::<lambda(size_t)> > >, ','> >(utl::csv_range<nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::csv_stop, utl::line_range<utl::buf_reader<utl::progress_tracker::update_fn()::<lambda(size_t)> > >, ','> &&, utl::for_each_t<nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::<lambda(nigiri::loader::gtfs::read_stops(nigiri::source_idx_t, nigiri::timetable&, tz_map&, std::string_view, std::string_view, unsigned int)::csv_stop&)> > &&) (
r=..., f=...) at /home/jonah/kde/src/motis/deps/utl/include/utl/pipes/for_each.h:15
#7 0x00005555568cad08 in nigiri::loader::gtfs::read_stops (src=..., src@entry=..., tt=..., timezones=..., stops_file_content=..., transfers_file_content="", link_stop_distance=<optimized out>)
at /home/jonah/kde/src/motis/deps/nigiri/src/loader/gtfs/stop.cc:193 Do you have an idea why that could happen? Without the move operator, switching to generic_string just worsens import performance a lot, but doesn't crash |
My mistake. I missed to return |
Thanks! |
It should finally work properly now :) |
I think it was good with the version we had before. The latest change with making |
With this change address sanitizer doesn't complain anymore: felixguendling@07f6ec7 |
What's the performance impact of the new solution? |
I ran it on the transitous data set, and it wasn't noticeably slower (I didn't have a previous number unfortunately). |
Okay, wasn't sure whether location really never lives longer than stop, but that simplifies things 👍 |
Looks good! :-) |
This will only start doing something once motis-project/utl#18 is merged
It is the minimum amount of changes to fix the issue I encountered, but should probably be done for more fields.