Calculating statistics directly from a CSV file

I have a transaction log file in CSV format that I'd like to run statistics on. The log contains the following fields:

date:  Time/date stamp
salesperson:  The username of the person who closed the sale
promo:  sum total of items in the sale that were promotions.
amount:  grand total of the sale

I'd like to get the following statistics:

salesperson:  The username of the salesperson being analyzed.
minAmount:  The smallest grand total of this salesperson's transaction.
avgAmount:  The mean grand total..
maxAmount:  The largest grand total..
minPromo:  The smallest promo amount by the salesperson.
avgPromo:  The mean promo amount...

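I'm tempted to build a database structure, import this file, write SQL, and pull out the statistics, but I don't need anything more from this data than these statistics. Is there an easier way? I'm hoping a bash script could make this easy.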

Best answer

TxtSushi does this:

tssql -table trans transactions.csv \
'select
    salesperson,
    min(as_real(amount)) as minAmount,
    avg(as_real(amount)) as avgAmount,
    max(as_real(amount)) as maxAmount,
    min(as_real(promo)) as minPromo,
    avg(as_real(promo)) as avgPromo
from trans
group by salesperson'

I have a bunch of example scripts showing how to use it.

Edit: fixed syntax
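
If installing TxtSushi is not an option, the same grouping can be approximated with plain bash and awk. This is only a rough sketch and is not part of the original answer. The function name sales_stats is hypothetical, and the script assumes the columns appear in the order date,salesperson,promo,amount, that the file has a single header row, and that no field contains a quoted comma (awk splits naively on commas, so it is not a full CSV parser).

```shell
#!/usr/bin/env bash
# Sketch: per-salesperson statistics with awk.
# Assumptions (not confirmed by the question): column order is
# date,salesperson,promo,amount; one header row; no quoted commas.
sales_stats() {
  echo "salesperson,minAmount,avgAmount,maxAmount,minPromo,avgPromo"
  awk -F, '
    NR == 1 { next }    # skip the header row
    NF < 4  { next }    # skip blank or short lines
    {
      sp = $2; promo = $3 + 0; amount = $4 + 0
      if (!(sp in count)) {
        # the first row for a salesperson seeds the running min and max
        minAmount[sp] = amount; maxAmount[sp] = amount; minPromo[sp] = promo
      }
      if (amount < minAmount[sp]) minAmount[sp] = amount
      if (amount > maxAmount[sp]) maxAmount[sp] = amount
      if (promo  < minPromo[sp])  minPromo[sp]  = promo
      count[sp]++
      sumAmount[sp] += amount
      sumPromo[sp]  += promo
    }
    END {
      # one output row per salesperson, values rounded to two decimals
      for (sp in count)
        printf "%s,%.2f,%.2f,%.2f,%.2f,%.2f\n", sp,
          minAmount[sp], sumAmount[sp] / count[sp], maxAmount[sp],
          minPromo[sp], sumPromo[sp] / count[sp]
    }
  ' "$1" | sort   # awk iterates arrays in no fixed order, so sort the rows
}
```

After sourcing the script, it would be called as sales_stats transactions.csv. For files with quoted fields or embedded commas, a real CSV-aware tool such as the tssql command above is the safer choice.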