What about Kaggle? Or GitHub?
It's not a big data set that lends itself primarily to analysis; it's more like content. For example, a list of all US Presidents with a lot of metadata or text content fields about them, collected and combined from different sources, then cleaned, corrected, annotated, etc. (Pretend Wikipedia has only a subset of these fields and considers broadening them out of scope.)
As for GitHub, the data would still be under "my" account, and I'm thinking of a platform that doesn't depend on one person. Maybe I would manage day-to-day version control in GitHub, but I'd want to promote occasional releases to somewhere more official that isn't reliant on my account.