Add Korean language #42

Merged
merged 1 commit into from
Apr 21, 2024

lens0021
Contributor

@lens0021 lens0021 commented Apr 20, 2024

No description provided.

@lens0021 lens0021 force-pushed the patch-1 branch 3 times, most recently from aaa27bb to 9bdfb1f, on April 20, 2024 09:18
Signed-off-by: Lens0021 / Leslie <lorentz0021@gmail.com>
Member

@biodranik biodranik left a comment

Thanks! Did you test it?

@vng where else should the language be enabled?

@lens0021
Contributor Author

lens0021 commented Apr 21, 2024

OK, I've now tested this patch and confirmed that the generated HTML files don't have the unwanted sections:

image

How I ran the parser:

  1. Downloaded south-korea-latest.osm.pbf from https://download.geofabrik.de/asia/south-korea.html
  2. Downloaded kowiki-NS0-20240420-ENTERPRISE-HTML.json.tar.gz from https://dumps.wikimedia.org/other/enterprise_html/runs/20240420/
  3. $ rustc --version
    rustc 1.73.0 (cc66ad468 2023-10-03)
    $ cargo run --release --
    $ ls
    article_processing_config.json
    benches/
    build.rs
    Cargo.lock
    Cargo.toml
    download.sh*
    ko.tsv
    kowiki-NS0-20240420-ENTERPRISE-HTML.json.tar.gz
    lib.sh
    LICENSE
    README.md
    run.sh*
    south-korea-latest.osm.pbf
    src/
    target/
    tests/
  4. $ target/release/om-wikiparser get-tags south-korea-latest.osm.pbf > ko.tsv
    $ head -n 5 ko.tsv
    @id     @otype  @version        wikidata        wikipedia
    301775326       0       11      Q495739 en:Axe murder incident
    306111080       0       2       Q12621551       ko:판암 나들목
    309675985       0       4       Q16093933       ko:경주 나들목
    309947253       0       42      Q42147  en:Cheongju
    $ mkdir descriptions
    $ tar -xvzf kowiki-NS0-20240420-ENTERPRISE-HTML.json.tar.gz
    $ mkdir dumps    
    $ mv kowiki_namespace_0_* dumps
  5. $ cat dumps/kowiki_namespace_0_0.ndjson | target/release/om-wikiparser get-articles --osm-tags ko.tsv --write-new-qids new_qids.txt descriptions/
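
As a quick sanity check on the tags file from step 4, a small Python sketch like the one below can count how many rows point at each Wikipedia language. It is not part of om-wikiparser: `count_wikipedia_languages` is a hypothetical helper, and it assumes that `ko.tsv` is tab-separated with the header row shown in the `head` output above. The sample rows are the ones from that output.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the `head -n 5 ko.tsv` output above.
# Assumption: the real file separates columns with single tab characters.
SAMPLE_TSV = (
    "@id\t@otype\t@version\twikidata\twikipedia\n"
    "301775326\t0\t11\tQ495739\ten:Axe murder incident\n"
    "306111080\t0\t2\tQ12621551\tko:판암 나들목\n"
    "309675985\t0\t4\tQ16093933\tko:경주 나들목\n"
    "309947253\t0\t42\tQ42147\ten:Cheongju\n"
)


def count_wikipedia_languages(tsv_file):
    """Count rows per language prefix of the `wikipedia` column (hypothetical helper)."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        tag = (row.get("wikipedia") or "").strip()
        if not tag:
            continue  # no wikipedia tag on this row (e.g. only a wikidata QID)
        lang, sep, _title = tag.partition(":")
        # A tag without a "lang:" prefix is counted under "unknown".
        counts[lang if sep else "unknown"] += 1
    return counts


counts = count_wikipedia_languages(io.StringIO(SAMPLE_TSV))
for lang, n in sorted(counts.items()):
    print(f"{lang}\t{n}")
```

To run it on the real file, pass `open("ko.tsv", encoding="utf-8", newline="")` instead of the in-memory sample.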

Member

@biodranik biodranik left a comment

Thanks! Let's see if it works without any additional changes in the generator code. Feel free to also add other wiki sections which are not necessary for a quick overview of the place.

@biodranik biodranik merged commit b2eabf8 into organicmaps:main Apr 21, 2024
1 check passed
@newsch
Collaborator

newsch commented Apr 21, 2024

@lens0021 thanks for the contribution and testing so thoroughly!
Did you have any trouble using it? Is there anything that would be helpful to add to the docs?

@lens0021
Contributor Author

lens0021 commented Apr 21, 2024

> Did you have any trouble using it? Is there anything that would be helpful to add to the docs?

At first, I didn't know how to run the maps build described in this step:

- Run a maps build with descriptions enabled to generate the `id_to_wikidata.csv` and `wiki_urls.txt` files.

So I tried the alternative way, but I didn't know what a pbf file was. Even after reading https://wiki.openstreetmap.org/wiki/Planet.osm, I still wasn't sure whether an extract file existed for my country. Fortunately, I found one through Google.

Oh, and I did not expect the config file to be read at build time. I thought the CLI read it.

3 participants