
While a perfectly valid approach, it is not without its issues. For example, it is not very robust to new categories or new postal codes, and if your data is sparse, the estimated distribution may be quite noisy. In data science, this kind of situation usually calls for some form of regularization. In a Bayesian approach, the historical distribution of postal codes drives the likelihood (I based mine on a Dirichlet-Multinomial distribution), but you still have to provide a prior. As I mentioned above, the prior takes over wherever your data is not rich enough to give a strong likelihood. Of course, unlike in the previous example, you don’t want an uninformative prior here; you want to leverage some domain knowledge, otherwise you might as well use the frequentist approach. A good prior for this problem is any population-based distribution (or anything that correlates with sales). The key point is that, unlike our data, the population distribution is not sparse, so every postal code has a chance of being sampled, which leads to a more robust model. The result is a model that makes the most of the data while gracefully handling new areas, with the prior acting as a fallback.
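
To make this concrete, here is a minimal sketch of the conjugate Dirichlet-Multinomial update described above. It is not the author’s code: the postal codes, sales counts, population figures, and the prior strength below are invented placeholders, chosen only to illustrate the mechanics.

import numpy as np

# Sketch of a Dirichlet-Multinomial model for sales by postal code.
# All numbers here are made-up placeholders, not real data.
postal_codes = ["1010", "1020", "1030", "1040"]

# Observed sales counts: sparse, and one area has no sales at all yet.
sales = np.array([42.0, 7.0, 0.0, 1.0])

# Population-based prior: not sparse, so every postal code keeps non-zero mass.
population = np.array([120_000.0, 30_000.0, 45_000.0, 15_000.0])
strength = 10.0  # how many "pseudo-observations" the prior is worth
alpha_prior = strength * population / population.sum()

# The Dirichlet is conjugate to the multinomial, so the posterior is simply
# Dirichlet(alpha_prior + observed counts).
alpha_post = alpha_prior + sales

# Posterior mean share of sales per postal code: a blend of data and prior.
posterior_mean = alpha_post / alpha_post.sum()

for code, raw, post in zip(postal_codes, sales / sales.sum(), posterior_mean):
    print(f"{code}: raw frequency {raw:.3f} -> posterior share {post:.3f}")

Where the counts are large, the data dominates; where they are small or zero, the estimate falls back toward the population-based prior, which is exactly the graceful handling of new areas described above.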


fn parse_all_ints_lenient(strings: [string]) -> [int]
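
The signature above is pseudocode. Assuming “lenient” means skipping entries that do not parse as integers rather than failing on the first bad value (an assumption, since the source gives only the signature), a minimal Python equivalent could look like this:

def parse_all_ints_lenient(strings):
    """Parse each string as an int, skipping entries that fail to parse.

    Assumed behaviour: the original is only a signature, so "lenient" is
    taken to mean "drop unparseable values" instead of raising an error.
    """
    result = []
    for s in strings:
        try:
            result.append(int(s.strip()))
        except ValueError:
            continue  # not an integer: skip it rather than fail the whole batch
    return result

# Hypothetical usage:
print(parse_all_ints_lenient(["12", " 7 ", "abc", "3.5", "-4"]))  # [12, 7, -4]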
