Over the last few days I’ve reported a few posts to the admins as spam, so I wrote this small bash script to detect possible spam posts.

INFO: You need to have httpie and jq installed. An API key is also required; the script reads it from the API_KEY environment variable.

INFO: To be honest, this is more an exercise in httpie and jq filtering than a useful tool.
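
As a rough sketch, the prerequisites could be checked before the script does anything else. The helper names `require_cmd` and `check_prereqs` are made up for this example and are not part of the script; the only assumptions taken from it are that httpie is invoked as `http` and that the key lives in `API_KEY`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: return 0 if the named command is on PATH, 1 otherwise.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "Missing required command: $1" >&2
    return 1
  }
}

# Hypothetical helper: report every missing prerequisite,
# return non-zero if anything is missing.
check_prereqs() {
  local missing=0
  require_cmd http || missing=1   # httpie is invoked as `http`
  require_cmd jq   || missing=1
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

Calling `check_prereqs || exit 1` at the top of the script would stop early with a readable message instead of failing halfway through a pipeline.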


latest=$(http api-key:$API_KEY accept:application/vnd.forem.api-v1+json  per_page==80)

filtered=$(jq '.[] | select(.reading_time_minutes==1 and .user.user_id > 4)' <<< "$latest")

echo "Total latest articles: $(jq -M -r '.id' <<< "$filtered" | wc -l)"
echo '-----'

# sort -u instead of uniq: uniq only collapses adjacent duplicates
echo "Number of authors: $(jq -M -r '.user.user_id' <<< "$filtered" | sort -u | wc -l)"
echo '-----'

users=$(jq -M -r '.user.user_id' <<< "$filtered" | sort -u)

for user_id in $users; do

   strjoined_at=$(http GET "$user_id" api-key:$API_KEY accept:application/vnd.forem.api-v1+json | jq -r '.joined_at')

   joined_at=$(date --date="$strjoined_at" "+%Y-%m-%d")
   days=$((($(date +%s) - $(date -d "$joined_at" +%s))/86400))

   # an unset $days defaults to 2, so the user is still listed for review
   if (( ${days:-2} < 3 )); then
        echo "User $user_id is suspected of posting spam, see posts:"
        jq --arg jq_user_id "$user_id" '.[] | select(.user.user_id == ($jq_user_id|tonumber)) | .url' <<< "$latest"
   fi
done

The script does the following:

1. retrieve the latest articles (80 max)
2. filter by reading_time_minutes, as spam posts are usually short
3. extract the unique user_ids
4. fetch the user details for each user_id
5. check whether the account was created recently
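
Steps 2 and 3 can be tried offline, without an API key. The sketch below runs the same jq filter over a made-up sample payload; the sample data, the example.com URLs and the function names `candidate_user_ids` and `urls_for_user` are assumptions for illustration, and only the fields the script reads are included.

```shell
#!/usr/bin/env bash

# Hypothetical sample payload, reduced to the fields the filter reads.
sample='[
  {"id": 1, "reading_time_minutes": 1, "url": "https://example.com/a", "user": {"user_id": 12}},
  {"id": 2, "reading_time_minutes": 5, "url": "https://example.com/b", "user": {"user_id": 7}},
  {"id": 3, "reading_time_minutes": 1, "url": "https://example.com/c", "user": {"user_id": 11}},
  {"id": 4, "reading_time_minutes": 1, "url": "https://example.com/d", "user": {"user_id": 12}}
]'

# Steps 2 and 3: keep one-minute reads, then list each author once.
candidate_user_ids() {
  jq -r '.[] | select(.reading_time_minutes == 1 and .user.user_id > 4) | .user.user_id' \
    | sort -n -u
}

# Print the URL of every post in the payload written by the given user.
urls_for_user() {
  jq -r --arg uid "$1" '.[] | select(.user.user_id == ($uid | tonumber)) | .url'
}

ids=$(printf '%s\n' "$sample" | candidate_user_ids)
echo "$ids"   # prints 11 and 12, one per line
```

User 12 wrote two short posts but is listed once because `sort -u` sorts before removing duplicates, whereas `uniq` alone only collapses duplicates that happen to be adjacent.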

Obviously, not all articles that meet these conditions are spam. Lots of people (like me) write a hello-world post right after creating their account, so the script only shows the URL; I can then read the post and decide whether it’s spam or not.
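
The "recently created" check can be isolated into a small helper, which makes the three-day threshold easy to adjust and test. This is only a sketch: the function names are invented here, GNU `date` is assumed (as in the script), and the optional second argument is an addition that lets "now" be fixed for testing.

```shell
#!/usr/bin/env bash

# Hypothetical helper: whole days between a date and "now".
# $1: any date string GNU date can parse
# $2: optional "now" as epoch seconds (defaults to the current time)
account_age_days() {
  local joined_epoch now_epoch
  joined_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=${2:-$(date -u +%s)}
  echo $(( (now_epoch - joined_epoch) / 86400 ))
}

# Hypothetical helper: succeed when the account is less than 3 days old,
# the same threshold the script uses. Returns 2 if the date cannot be parsed.
is_recent_account() {
  local days
  days=$(account_age_days "$1" "${2:-}") || return 2
  [ "$days" -lt 3 ]
}
```

For example, `is_recent_account "$strjoined_at" && echo "new account"` would flag accounts created in the last three days. Unlike the script, which treats a missing value as suspicious, this version reports an unparseable date as a separate error code.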

For the next version, if I have time, I would like to include some kind of "AI" to automatically read the post and decide whether it is spam.

