It would be extremely useful to be able to export our character lists. I really want to use AI to generate stories using my known characters.
We definitely want to add export functionality at some point, and an API. For your use case, what pieces of information would you like? Is character/meaning(s)/pinyin sufficient?
Would also love this feature, great idea!
I would love the feature as well.
I am also using AI for learning and this would help a lot, as I could easily use only the learned characters and words.
For me the simplest way to implement this would be a generated plain webpage with no formatting, containing all the characters and words already learned. Just a simple list of all the characters: the plain hanzi text without any descriptions or comments, since those would need to be manually removed before feeding the list to an AI.
It would be even better if these characters were grouped by SRS level, with the groups separated by spaces, so it would be possible to copy and paste only the characters we have learned more or less well than others. It would help with training.
Yes, that would definitely be enough!
FYI you can do this already with a small amount of fiddling. If you go to the character curriculum page (for me it’s here: Simplified Chinese Character Mnemonics | HanziHero) and then just select all the boxes up to the point you’ve learnt, you can copy + paste it into a text document (e.g. ctrl+shift+v into notepad to strip all the formatting) and you’ll get this:
yī 一 one
bù 不 no
le 了 did
rén 人 person
...
You can then use many different ways to extract the character out of each line if you don’t want the rest. I did it with regex, but an easier way is probably just importing your text document into Google Sheets and having it split columns on the space character. All the characters will then be sitting neatly in the 2nd column for you to do what you want with.
Only takes a minute or two at most - the hardest bit is just scrolling far enough to highlight them all lol. It’s not a one button solution, but I’ve done it like 3 or 4 times now at different points and it’s fairly quick+painless.
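For anyone who prefers a script over Google Sheets, here is a minimal Python sketch of the same idea. It assumes the pasted text looks exactly like the sample above (pinyin, then the character, then the meaning, separated by whitespace); the function name extract_hanzi is made up for illustration, and the regex only covers the basic CJK Unified Ideographs block.

```python
import re

# Matches exactly one character from the basic CJK Unified Ideographs block.
# Assumption: this range is enough for the common hanzi in the curriculum.
HANZI = re.compile(r"[\u4e00-\u9fff]")


def extract_hanzi(pasted: str) -> list[str]:
    """Collect the 2nd whitespace-separated field of each line, if it is a hanzi."""
    chars = []
    for line in pasted.splitlines():
        parts = line.split()
        # Skip blank lines, "..." lines, and anything without a hanzi in column 2.
        if len(parts) >= 2 and HANZI.fullmatch(parts[1]):
            chars.append(parts[1])
    return chars


sample = """yī 一 one
bù 不 no
le 了 did
rén 人 person"""

print("".join(extract_hanzi(sample)))  # prints 一不了人
```

The joined string can be pasted straight into an AI prompt as the list of allowed characters.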
Doing the same but would still like a simple CSV export or something. It would speed up the process a lot, especially since the scrolling gets a bit annoying above 50 characters.
For an API, would it be something to be used in Python? I think if there’s an API being built out, I’d assume functionality for characters and words would be similar.
# character pull: optional meaning, pinyin
table = hzh_characters_api(meaning=True, pinyin=True)
# table would be something like
# {
# character:
# {
# meaning, # as a string
# pinyin # as a string
# }
# }
# word pull: optional meaning, pinyin, example sentences
table = hzh_words_api(meaning=True, pinyin=True, examples=True)
# table would be something like
# {
# word:
# {
# meaning, # as a string
# pinyin, # as a string
# [sentences] # as a list or set
# }
# }
# reading into pandas
import pandas as pd
df = pd.DataFrame.from_dict(table, orient='index')
I think in the meantime, or at least as a first step, we can add a simple CSV export functionality. This will solve the needs of 90% of cases. I’ll file a ticket in our backlog for this.
Hey folks, there’s now a route for you to get a CSV of all known items. Just add a .csv to the end of the list URL for any item type, e.g. /traditional/characters.csv and it’ll give a preliminary CSV. If you ctrl+s on that page you can save it locally for your own processing. Let me know if you run into any issues.
Small, very minor note: what @Phil said works, but it is case sensitive, so “.csv” will work but “.CSV” will not.
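Once the CSV is saved locally, pulling out just the characters for an AI prompt could look roughly like the sketch below. The column layout of the real export is not described in this thread, so the header name "character" and the sample rows are assumptions for illustration only; check the first line of your own file and pass the right column name. The helper known_characters is likewise a made-up name.

```python
import csv
import io


def known_characters(csv_text: str, column: str = "character") -> list[str]:
    """Return the non-empty values of one column from CSV text.

    Assumption: the export has a header row; "character" is a guessed
    column name, not a documented one.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]


# Made-up sample standing in for the contents of a saved characters.csv.
sample_csv = "character,pinyin,meaning\n一,yī,one\n不,bù,no\n"

print(known_characters(sample_csv))  # prints ['一', '不']
```

For a file on disk, read it with open(path, encoding="utf-8", newline="") and pass the text in.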