SpamAssassin Bootcamp (sa-learn) train BAYES
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Whilst you're tinkering, is this of any interest to you: viewtopic.php?f=20&t=28052 ?
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
jimimaseye wrote: ↑2018-08-27 11:34
Whilst you're tinkering, is this of any interest to you: viewtopic.php?f=20&t=28052 ?

That's very cool, but I have already set up an account to which all spam gets sent, including spam above the delete threshold. That way I can sort it for learning purposes, because I know that no users are sorting it. At least, they're not sorting in any consistent manner.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
I used sa-learn.0.6.1.rar and ran into a problem copying mail (spam.cmd, ham.cmd).
How can I solve the copying problem?
Code: Select all
C:\SpamAssassin\temp>COPY "C:\Program Files (x86)\hMailServer\Data\home.aln\spam\4B\{4BEF5CA4-5DBD-4668-B8B2-3C8104BD65BB}.eml" C:\SpamAssassin\temp\ham\5585.eml /Y
The system cannot find the path specified.
0 file(s) copied.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
rub.ak wrote: ↑2018-09-22 19:49
I used sa-learn.0.6.1.rar and ran into a problem copying mail (spam.cmd, ham.cmd).
How can I solve the copying problem?
Code: Select all
C:\SpamAssassin\temp>COPY "C:\Program Files (x86)\hMailServer\Data\home.aln\spam\4B\{4BEF5CA4-5DBD-4668-B8B2-3C8104BD65BB}.eml" C:\SpamAssassin\temp\ham\5585.eml /Y
The system cannot find the path specified.
0 file(s) copied.

What version of HMS do you have? Can you run this and post the results: viewtopic.php?f=20&t=30914
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Thanks for the response.
I solved the problem myself: it was enough to create the folders temp\spam and temp\ham.
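For anyone hitting the same "cannot find the path" error: the generated ham.cmd and spam.cmd copy into temp\ham and temp\spam, and COPY does not create a missing target folder. A minimal sketch of creating the folders up front, assuming the C:\SpamAssassin\temp location seen in the error output (adjust it to whatever your TempDir is):

```batch
REM Create the copy targets if they are missing (paths assumed from the error output above).
if not exist "C:\SpamAssassin\temp\ham" mkdir "C:\SpamAssassin\temp\ham"
if not exist "C:\SpamAssassin\temp\spam" mkdir "C:\SpamAssassin\temp\spam"
```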
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Ok
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Hello,

SorenR wrote: ↑2014-08-09 13:31
The success of SpamAssassin relies on a well-trained Bayes database. There are many ways to train your Bayes database; this is my shot at doing it.
NB! wrote:
How to obtain, install and configure SpamAssassin is NOT described here! Search elsewhere on this forum for this information.
The idea came from my play/toy mail server (Postfix, Dovecot & MailScanner) on my Synology DS209+II NAS. There it is basically a no-brainer to set up, as everything works off of Maildirs.
So how to do it on a hMailServer 5.4.2-B1964 system ...
1 -- Build a script using the COM API to find and extract relevant emails, and what could be more natural than to assume INBOX is good and SPAM is bad? Also, if HAM ends up in SPAM (or vice versa), you move it to the respective folder (INBOX or SPAM) and at the next scheduled run the email will be classified differently. I execute this script using Windows Task Scheduler at 04:00 in the morning, when everyone is (supposed to be) asleep.
2 -- A global rule in hMailServer to move emails tagged as SPAM into a SPAM folder. Setting this up as a global rule ensures that ALL users of hMailServer are covered. If the SPAM folder does not exist, hMailServer will create it automatically.
Rule name: sa-learn
Code: Select all
Criteria -> Custom header field -> X-hMailServer-Spam = YES
Action -> Move to IMAP folder -> IMAP folder = SPAM
Some admins may want to monitor what is tagged as SPAM, like me, so I forward a copy to spam@my-domain.tld with this revised global rule.
Rule name: sa-learn (BigBrother version) :: Use AND
Code: Select all
Criteria -> Custom header field -> X-hMailServer-Spam = YES
Criteria -> Custom header field -> X-hMailServer-LoopCount < 1
Action -> Move to IMAP folder -> IMAP folder = SPAM
Action -> Forward email -> To = spam@my-domain.tld
"X-hMailServer-LoopCount" is used to prevent loops. By checking for value < 1 we make sure it is only run once. Thus SPAM will stay in the spam@my-domain.tld INBOX.
3 -- Now that we have both good and bad emails defined, we need to pull them off of the server. For that I have chosen VBScript to interact with the hMailServer COM API.
sa-learn.vbs
Functional Description: wrote:
The VBScript (sa-learn.vbs) currently works with ONE domain, as I only have one domain. Using the COM API it will locate and process all account addresses for that domain, except those addresses listed in the exception list.
The script will do two passes, one for INBOX and one for SPAM, and generate two .cmd files (HAMCopy.cmd & SPAMCopy.cmd) to be run by the script.
During the two passes, the number of messages in the respective folder is checked and only the last (max.) 20 messages are processed. This procedure is based on the assumption that the COM API returns data sorted by table ID; from examining the database, it appears that hMailServer simply adds new message IDs to the table rather than reusing deleted message IDs.
HAMCopy.cmd and SPAMCopy.cmd simply copy the selected .eml files to a HAM or SPAM directory.
A third .cmd file (sa-learn.cmd) is also executed by the script. It contains the commands to execute sa-learn --spam, sa-learn --ham, sa-learn --sync and sa-learn --backup, as is customary on Unix-type systems.
On a dual-core 3 GHz system with 4 GB RAM, SATA and Windows Server 2003 R2 it took almost 20 minutes to process ~4,200 HAM and ~5,600 SPAM emails.
Code: Select all
Option Explicit
'
' Version 0.1.0 09-08-2014, Soren Rathje - Initial version.
'
Dim hmAdmin, hmPassword, hmDomain, hmSPAMFolder, hmSPAMDir, hmHAMFolder, hmHAMDir, hmExcludeAddress
Dim i, j, s, objApp, objDomain, objAccount, objIMAPFolder, objMessage
Dim fsoSPAM, fsoHAM, fsoSALearn, objFSO, objSPAM, objHAM, objShell
'
' Configuration parameters - BEGIN
'
hmAdmin = "Administrator" ' hMailServer Administrator user
hmPassword = "********" ' hMailServer Administrator password
hmDomain = "my-domain.tld" ' Domain name
hmSPAMFolder = "SPAM" ' SPAM IMAP folder
hmSPAMDir = "C:\hMailServer\SPAM" ' You need to create this directory!
hmHAMFolder = "INBOX" ' HAM IMAP Folder
hmHAMDir = "C:\hMailServer\HAM" ' You need to create this directory!
hmExcludeAddress = "spam@my-domain.tld, surveillance@my-domain.tld"
fsoSPAM = "C:\hMailServer\Events\SPAMCopy.cmd"
fsoHAM = "C:\hMailServer\Events\HAMCopy.cmd"
fsoSALearn = "C:\hMailServer\Events\sa-learn.cmd"
'
' Configuration parameters - END
'
Set objShell = WScript.CreateObject("WScript.Shell")
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objApp = CreateObject("hMailServer.Application")
Call objApp.Authenticate(hmAdmin, hmPassword)
Set objDomain = objApp.Domains.ItemByName(hmDomain)
'
' Find SPAM messages
'
Set objSPAM = objFSO.CreateTextFile(fsoSPAM,True)
For i = 0 to objDomain.Accounts.Count -1
  Set objAccount = objDomain.Accounts.Item(i)
  ' DO NOT process excluded and non-active accounts.
  If InStr(hmExcludeAddress, objAccount.Address) = 0 And objAccount.Active Then
    Set objIMAPFolder = objAccount.IMAPFolders.ItemByName(hmSPAMFolder)
    ' If no messages - skip
    If objIMAPFolder.Messages.Count > 0 Then
      s = 0
      If objIMAPFolder.Messages.Count - 20 > 0 Then s = objIMAPFolder.Messages.Count - 20
      For j = s to objIMAPFolder.Messages.Count -1
        Set objMessage = objIMAPFolder.Messages.Item(j)
        objSPAM.Write "COPY " & objMessage.FileName & " " & hmSPAMDir & " /Y" & vbCrLf
      Next
    End If
  End If
Next
objSPAM.Close
'
' Find HAM messages
'
Set objHAM = objFSO.CreateTextFile(fsoHAM,True)
For i = 0 to objDomain.Accounts.Count -1
  Set objAccount = objDomain.Accounts.Item(i)
  ' DO NOT process excluded and non-active accounts.
  If InStr(hmExcludeAddress, objAccount.Address) = 0 And objAccount.Active Then
    Set objIMAPFolder = objAccount.IMAPFolders.ItemByName(hmHAMFolder)
    ' If no messages - skip
    If objIMAPFolder.Messages.Count > 0 Then
      s = 0
      If objIMAPFolder.Messages.Count - 20 > 0 Then s = objIMAPFolder.Messages.Count - 20
      For j = s to objIMAPFolder.Messages.Count -1
        Set objMessage = objIMAPFolder.Messages.Item(j)
        objHAM.Write "COPY " & objMessage.FileName & " " & hmHAMDir & " /Y" & vbCrLf
      Next
    End If
  End If
Next
objHAM.Close
'
' Execute file copy and sa-learn.exe - sequentially - no StdOut.
'
objShell.Run fsoSPAM, 0, true
objShell.Run fsoHAM, 0, true
objShell.Run fsoSALearn, 0, true
sa-learn.cmd
Code: Select all
C:\SpamAssassin\sa-learn.exe --siteconfigpath="C:\SpamAssassin\etc\spamassassin" --dbpath "C:\Documents and Settings\Default User\.spamassassin\bayes" --spam "C:\hMailServer\SPAM\*.eml"
C:\SpamAssassin\sa-learn.exe --siteconfigpath="C:\SpamAssassin\etc\spamassassin" --dbpath "C:\Documents and Settings\Default User\.spamassassin\bayes" --ham "C:\hMailServer\HAM\*.eml"
C:\SpamAssassin\sa-learn.exe --siteconfigpath="C:\SpamAssassin\etc\spamassassin" --dbpath "C:\Documents and Settings\Default User\.spamassassin\bayes" --sync
C:\SpamAssassin\sa-learn.exe --siteconfigpath="C:\SpamAssassin\etc\spamassassin" --dbpath "C:\Documents and Settings\Default User\.spamassassin\bayes" --backup > "C:\Documents and Settings\Default User\.spamassassin\bayes_backup"
REM DEL C:\hMailServer\SPAM\*.eml /Q
REM DEL C:\hMailServer\HAM\*.eml /Q
Disclaimer: wrote: I take no responsibility for what you may or may not do with the above script/shell script. It works for me - it may not work for you. I DO NOT GUARANTEE THIS CODE TO BE BUG-FREE! USE AT YOUR OWN RISK! WHATEVER YOU DO - IT'S NOT MY FAULT!
AND remember; Real men do NOT backup - but they CRY a lot!
Please feel free to adopt and modify.
I want to use this with my existing hMailServer installation; however, I am a bit confused about how to use and implement the script. Are there any step-by-step instructions (similar to what Jimimaseye did in this post: https://www.hmailserver.com/forum/viewt ... 91#p174991 ) on where to put these scripts? Which folder, etc.
Thank you.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
ashtec014 wrote: ↑2019-11-28 11:55
Hello,
SorenR wrote: ↑2014-08-09 13:31
The success of SpamAssassin relies on a well-trained Bayes database. There are many ways to train your Bayes database; this is my shot at doing it.
NB! wrote:
How to obtain, install and configure SpamAssassin is NOT described here! Search elsewhere on this forum for this information.
The idea came from my play/toy mail server (Postfix, Dovecot & MailScanner) on my Synology DS209+II NAS. There it is basically a no-brainer to set up, as everything works off of Maildirs.
So how to do it on a hMailServer 5.4.2-B1964 system ...
1 -- Build a script using the COM API to find and extract relevant emails, and what could be more natural than to assume INBOX is good and SPAM is bad? Also, if HAM ends up in SPAM (or vice versa), you move it to the respective folder (INBOX or SPAM) and at the next scheduled run the email will be classified differently. I execute this script using Windows Task Scheduler at 04:00 in the morning, when everyone is (supposed to be) asleep.
2 -- A global rule in hMailServer to move emails tagged as SPAM into a SPAM folder. Setting this up as a global rule ensures that ALL users of hMailServer are covered. If the SPAM folder does not exist, hMailServer will create it automatically.
bla bla bla
bla bla bla
Disclaimer: wrote: I take no responsibility for what you may or may not do with the above script/shell script. It works for me - it may not work for you. I DO NOT GUARANTEE THIS CODE TO BE BUG-FREE! USE AT YOUR OWN RISK! WHATEVER YOU DO - IT'S NOT MY FAULT!
AND remember; Real men do NOT backup - but they CRY a lot!
Please feel free to adopt and modify.
I want to use this with my existing hMailServer installation; however, I am a bit confused about how to use and implement the script. Are there any step-by-step instructions (similar to what Jimimaseye did in this post: https://www.hmailserver.com/forum/viewt ... 91#p174991 ) on where to put these scripts? Which folder, etc.
Thank you.

Did you read the whole thread? The code in the very first post will probably not even work anymore after 5 years of Microsoft updates. I do not have admin privileges on this board, so code changes are posted as they are made.
Once you have read everything you will realize that it does not really matter where you put the scripts; they have hardcoded paths in them. I have sa-learn.vbs and sa-learn.cmd in my c:\spamassassin directory.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2018-07-28 13:43
Ok... Two steps forward and one step back...
I suspect there continues to be an issue with curly brackets.
Version 0.6.1 is back to creating HAM.CMD and SPAM.CMD that will copy mails to c:\spamassassin\temp\ham and c:\spamassassin\temp\spam. SA-LEARN.CMD does the actual learning.
DO NOT forget to delete the HAM and SPAM files in .\temp or you WILL get the permission issue. You need to recreate the HAM and SPAM folders in .\temp.

Hi, I've managed to run this version without any errors; however, I am not sure if I configured it right.
My logs show no data:
Code: Select all
Wed 12/11/2019 18:29:51.94 - START
HAM:
SPAM:
Wed 12/11/2019 18:29:51.95 - STOP
Wed 12/11/2019 18:43:48.84 - START
HAM:
SPAM:
Wed 12/11/2019 18:46:03.29 - STOP
Wed 12/11/2019 18:46:34.55 - START
HAM:
SPAM:
Wed 12/11/2019 18:47:41.52 - STOP
Wed 12/11/2019 18:52:41.20 - START
HAM:
SPAM:
Wed 12/11/2019 18:52:41.20 - STOP
Wed 12/11/2019 18:55:55.38 - START
HAM:
SPAM:
Wed 12/11/2019 18:55:55.38 - STOP
I'm worried that if I schedule it for, say, 04:00, it won't run unattended, because I need to click the 'OK' button to proceed. Is there a way to run sa-learn.vbs automatically?
My 'temp' folder has data but the 'bayes_db' folder has no data.
Also, do I need to enable any of these in my local.cf, or should I just leave it as it is?
Code: Select all
# Use Bayesian classifier (default: 1)
#
# use_bayes 1
# Bayesian classifier auto-learning (default: 1)
#
# bayes_auto_learn 1
# Set headers which may provide inappropriate cues to the Bayesian
# classifier
#
# bayes_ignore_header X-Bogosity
# bayes_ignore_header X-Spam-Flag
# bayes_ignore_header X-Spam-Status
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Re the popup:
You have it set to 1, I believe.
Code: Select all
Const Verbose = 0
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
ashtec014 wrote: ↑2019-12-11 18:24
Also, do I need to enable any of these in my local.cf, or should I just leave it as it is?
Code: Select all
# Use Bayesian classifier (default: 1)
#
# use_bayes 1
# Bayesian classifier auto-learning (default: 1)
#
# bayes_auto_learn 1
# Set headers which may provide inappropriate cues to the Bayesian
# classifier
#
# bayes_ignore_header X-Bogosity
# bayes_ignore_header X-Spam-Flag
# bayes_ignore_header X-Spam-Status

Try this instead. Make sure to change bayes_path to the correct folder. The Windows user account that SA runs under needs access permission to the folder, so don't put it somewhere that SA cannot read and write. Restart required.
Code: Select all
use_bayes 1
bayes_path X:\sa-learn\.spamassassin\bayes
bayes_auto_learn 0
# bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
Last edited by palinka on 2019-12-11 19:34, edited 1 time in total.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Also....
https://spamassassin.apache.org/full/3. ... _Conf.html

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

You can change the minimum if you need to, but I think it will work better with the default minimum of 200. I trust these guys to know what's best.
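If you want to see how close you are to those minimums, sa-learn can dump the counters from the Bayes database. A sketch, assuming sa-learn.exe is run from the SpamAssassin folder and the database lives at X:\sa-learn\.spamassassin\bayes (the placeholder bayes_path used in this thread); adjust both to your setup:

```batch
REM Show only the learned-spam (nspam) and learned-ham (nham) counters.
sa-learn.exe --dbpath "X:\sa-learn\.spamassassin\bayes" --dump magic | findstr /C:"nspam" /C:"nham"
```

Bayes should only start scoring once both counters have reached the configured minimums.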
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2019-12-11 19:18
Re the popup:
You have it set to 1, I believe.
Code: Select all
Const Verbose = 0

Hi Palinka, I changed this to
Code: Select all
Const Verbose = 1
but I am still getting the popup after running it from Task Scheduler.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2019-12-11 19:32
Also....
https://spamassassin.apache.org/full/3. ... _Conf.html
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.
You can change the minimum if you need to, but I think it will work better with the default minimum of 200. I trust these guys to know what's best.

Regarding this one:
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
Do I need to add this to local.cf as well, or where do I find it?
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
ashtec014 wrote: ↑2019-12-12 09:16
palinka wrote: ↑2019-12-11 19:18
Re the popup:
You have it set to 1, I believe.
Code: Select all
Const Verbose = 0
Hi Palinka, I changed this to
Code: Select all
Const Verbose = 1
but I am still getting the popup after running it from Task Scheduler.

Should be 0 for no popup.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
ashtec014 wrote: ↑2019-12-12 09:18
palinka wrote: ↑2019-12-11 19:32
Also....
https://spamassassin.apache.org/full/3. ... _Conf.html
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.
You can change the minimum if you need to, but I think it will work better with the default minimum of 200. I trust these guys to know what's best.
Regarding this one:
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
Do I need to add this to local.cf as well, or where do I find it?

You don't need to add it. But you will need 200 spams and 200 hams for Bayes to work. If you don't have that many and can't wait, you can add those lines to local.cf and lower the numbers. But I recommend waiting for 200.
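For reference, lowering the thresholds would look something like this in local.cf. The value 100 is only an illustrative assumption, not a recommendation from this thread; the defaults are 200.

```conf
# Illustrative values only; the SpamAssassin defaults are 200 each.
bayes_min_ham_num 100
bayes_min_spam_num 100
```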
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2019-12-12 13:47
ashtec014 wrote: ↑2019-12-12 09:16
palinka wrote: ↑2019-12-11 19:18
Re the popup:
You have it set to 1, I believe.
Code: Select all
Const Verbose = 0
Hi Palinka, I changed this to
Code: Select all
Const Verbose = 1
but I am still getting the popup after running it from Task Scheduler.
Should be 0 for no popup.

The popup still appears after reverting it back to 0.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2019-12-12 13:50
ashtec014 wrote: ↑2019-12-12 09:18
palinka wrote: ↑2019-12-11 19:32
Also....
https://spamassassin.apache.org/full/3. ... _Conf.html
You can change the minimum if you need to, but I think it will work better with the default minimum of 200. I trust these guys to know what's best.
Regarding this one:
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
Do I need to add this to local.cf as well, or where do I find it?
You don't need to add it. But you will need 200 spams and 200 hams for Bayes to work. If you don't have that many and can't wait, you can add those lines to local.cf and lower the numbers. But I recommend waiting for 200.

Thank you, this is noted. So far the spam count is not yet 200; I'm sure the ham count is already over 200, but I'm going to wait as recommended.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
ashtec014 wrote: ↑2019-12-12 14:43
palinka wrote: ↑2019-12-12 13:47
ashtec014 wrote: ↑2019-12-12 09:16
Hi Palinka, I changed this to
Code: Select all
Const Verbose = 1
but I am still getting the popup after running it from Task Scheduler.
Should be 0 for no popup.
The popup still appears after reverting it back to 0.

Are you calling it with cscript?
If you use wscript you get popups.
HMS 5.6.8 B2534.28 on Windows Server 2019 Core VM.
HMS 5.6.9 B2641.67 on Windows Server 2016 Core VM.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Unfortunately not. I just call the script directly from Windows Task Scheduler. I have no idea how to use cscript. Can you please show me how? Thank you
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Here's what I have. I only changed the password and domain names.
As you can see in the script below, setting "Const Verbose = 0" makes it skip lines like: If Verbose Then WScript.Echo "Starting the learning process ... "
If that's not happening for you, then you can delete all the Verbose If statements. But you probably have some silly mistake: your scheduled task is using a different file than the one you're editing, or you didn't save it, or something like that.
Code: Select all
Option Explicit
'
' Version 0.6.1 28/07-2018, Soren Rathje - Curley brackets still acting up, reverting to copying mails.
' Version 0.6.0 26/07-2018, Soren Rathje - Experimental support for both sa-learn AND spamc -L.
' Version 0.5.a 26/07-2018, Soren Rathje - Experimental rewrite of filelist to fix curly brace problem in sa-learn.
' Version 0.5.0 25/07-2018, Soren Rathje - Multiple domains, skipping non-existing folders plus reworked code.
' Version 0.4.3 30/05-2018, Soren Rathje - Introduced two special folders; non-delivered SPAM and False Positives.
' Version 0.4.2 28/05-2016, Soren Rathje - Compatibility issues (curly brackets bug in sa-learn).
' Version 0.4.1 25/11-2014, Soren Rathje - Compatibility issues.
' Version 0.4.0 27/10-2014, Soren Rathje - Changed error logging.
' Version 0.3.0 11/10-2014, Soren Rathje - Bugfixing & Log error to Eventlog if IMAPFolder is missing
' Version 0.2.0 30/08-2014, Soren Rathje - Selection changed to DAYS.
' Version 0.1.0 09-08-2014, Soren Rathje - Initial version.
'
' Configuration parameters
'
' Administrative and automation accounts can be excluded from processing
' by defining them in "ExcludeList"
'
Const Administrator = "Administrator"
Const Secret = "supersecretpassword"
Const ExcludeList = "user@mydomain.tld, user@anotherdomain.tld"
Const DomainList = "mydomain.tld, anotherdomain.tld, thirddomain.tld"
Const SPAMFolders = "SPAM, Junk, Junk E-mail"
Const HAMFolders = "INBOX, HAM"
Const Batch = 1 ' 0 = spamc, 1 = sa-learn
Const SALearn = "X:\sa-learn\sa-learn.cmd" ' Intermediate commandfile
Const TempDir = "X:\sa-learn\temp\" ' Need permission for create, read & write
Const LogDir = "C:\SpamAssassin\logs\" ' Need permission for create, read & write
Const BayesDir = "X:\sa-learn\.spamassassin\" ' Need permission for create, read & write
Const RetainDays = 7
Const Verbose = 0
Sub BuildList(a, mExcludes, b, mDays, mTemp, mType)
  Dim i, j, k, l, strFileName
  Dim oFile, oDomain, oAccount, oMessage, oMessages
  Dim mDomain, mDomains, mFolder, mFolders
  If Batch Then Set oFile = oFSO.CreateTextFile(mTemp & mType & ".cmd",True)
  mDomains = Split(a, ",")
  mFolders = Split(b, ",")
  If Verbose Then WScript.Echo "Type: " & mType
  For Each mDomain In mDomains
    mDomain = Trim(mDomain)
    Set oDomain = oApp.Domains.ItemByName(mDomain)
    If Verbose Then WScript.Echo " Domain: " & oDomain.Name
    For i = 0 To oDomain.Accounts.Count - 1
      Set oAccount = oDomain.Accounts.Item(i)
      If InStr(mExcludes, oAccount.Address) = 0 And oAccount.Active Then
        If Verbose Then WScript.Echo " Account: " & oAccount.Address
        For Each mFolder In mFolders
          mFolder = Trim(mFolder)
          On Error Resume Next
          If oAccount.IMAPFolders.ItemByName(mFolder) Is Nothing Then
            On Error Goto 0
          Else
            On Error Goto 0
            If Verbose Then WScript.Echo " Folder: " & mFolder
            Set oMessages = oAccount.IMAPFolders.ItemByName(mFolder).Messages
            If Not IsNull(oMessages) Then
              For j = 0 To oMessages.Count - 1
                Set oMessage = oMessages.Item(j)
                If oMessage.InternalDate > CDate(Now - mDays) Then
                  If Batch Then
                    oFile.Write "COPY " & Chr(34) & oMessage.FileName & Chr(34) & " " & mTemp & mType & "\" & CLng(oMessage.ID) & ".eml /Y" & vbCrLf
                  Else
                    oCMD.Run "cmd.exe /C spamc.exe -d " & SAHost & " -p " & SAPort & " -L " & mType & " < " & Chr(34) & oMessage.FileName & Chr(34), Verbose, True
                  End If
                End If
              Next
            End If
          End If
        Next
      End If
    Next
  Next
  If Batch Then oFile.Close
End Sub
Dim oFSO : Set oFSO = CreateObject("Scripting.FileSystemObject")
Dim oApp : Set oApp = CreateObject("hMailServer.Application")
Dim oCMD : Set oCMD = CreateObject("WScript.Shell")
Call oApp.Authenticate(Administrator, Secret)
Dim SAHost : SAHost = oApp.Settings.AntiSpam.SpamAssassinHost
Dim SAPort : SAPort = oApp.Settings.AntiSpam.SpamAssassinPort
'
' Find HAM messages
'
If Verbose Then WScript.Echo "Processing HAM mails"
Call BuildList(DomainList, ExcludeList, HAMFolders, RetainDays, TempDir, "ham")
'
' Find SPAM messages
'
If Verbose Then WScript.Echo "Processing SPAM mails"
Call BuildList(DomainList, ExcludeList, SPAMFolders, RetainDays, TempDir, "spam")
'
' Execute file copy and sa-learn.exe - sequentially - no StdOut.
'
If Batch Then
  ' On Error Resume Next
  If Verbose Then WScript.Echo "Copying SPAM mails"
  oCMD.Run "cmd.exe /C " & TempDir & "spam.cmd", Verbose, True
  If Err.Number Then
    EventLog.Write ("Exception : sa-learn.vbs -> oCMD.Run SPAMCopy, 0, true")
    EventLog.Write ("Error : " & Err.Number)
    EventLog.Write ("Error (hex) : 0x" & Hex(Err.Number))
    EventLog.Write ("Source : " & Err.Source)
    EventLog.Write ("Description : " & Err.Description)
    Err.Clear
  End If
  If Verbose Then WScript.Echo "Copying HAM mails"
  oCMD.Run "cmd.exe /C " & TempDir & "ham.cmd", Verbose, True
  If Err.Number Then
    EventLog.Write ("Exception : sa-learn.vbs -> oCMD.Run HAMCopy, 0, true")
    EventLog.Write ("Error : " & Err.Number)
    EventLog.Write ("Error (hex) : 0x" & Hex(Err.Number))
    EventLog.Write ("Source : " & Err.Source)
    EventLog.Write ("Description : " & Err.Description)
    Err.Clear
  End If
  If Verbose Then WScript.Echo "Starting the learning process ... "
  oCMD.Run "cmd.exe /C " & SALearn, Verbose, True
  If Err.Number Then
    EventLog.Write ("Exception : sa-learn.vbs -> oCMD.Run SALearn, 0, true")
    EventLog.Write ("Error : " & Err.Number)
    EventLog.Write ("Error (hex) : 0x" & Hex(Err.Number))
    EventLog.Write ("Source : " & Err.Source)
    EventLog.Write ("Description : " & Err.Description)
    Err.Clear
  End If
  ' On Error Goto 0
End If
As you can see, by setting "Const Verbose = 0" it will bypass lines like:
Code: Select all
If Verbose Then WScript.Echo "Starting the learning process ... "
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
You call it like this in windows scheduler: "cscript.exe" C:\path\sa-learn.vbs
Code: Select all
schtasks /create /ru "SYSTEM" /tn "SA learn" /tr "\"cscript.exe\" C:\path\sa-learn.vbs" /sc DAILY /mo 1
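Since the goal earlier in the thread was an unattended run at 04:00, a variation on the command above may help. This is only a sketch: C:\path\sa-learn.vbs is the same placeholder path, /st sets the start time, and the cscript switches //B and //NoLogo suppress prompts and the banner.

```batch
REM Daily run at 04:00 as SYSTEM; //B = batch mode (no prompts), //NoLogo = no banner.
schtasks /create /ru "SYSTEM" /tn "SA learn" /tr "cscript.exe //B //NoLogo C:\path\sa-learn.vbs" /sc DAILY /st 04:00
```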
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
tunis wrote: ↑2019-12-12 17:02
You call it like this in windows scheduler: "cscript.exe" C:\path\sa-learn.vbs
Code: Select all
schtasks /create /ru "SYSTEM" /tn "SA learn" /tr "\"cscript.exe\" C:\path\sa-learn.vbs" /sc DAILY /mo 1

This one works for me. I ran it from a command prompt, then ran it from Windows Task Scheduler, and got the logs. Thanks so much, guys! I really appreciate your help.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2018-07-26 19:21
OK, this is slowly getting out of hand ...
Support for "spamc.exe -L" and "sa-learn.exe". For spamc to work you must run spamd with "--allow-tell".
When Batch = 0 (spamc) mode is selected there will be no sync of the database and no backup. Spamd is doing the sync on-the-fly and backup is on you.
sa-learn.vbs Version 0.6.0
Code: Select all
oCMD.Run "cmd.exe /C spamc.exe -d " & SAHost & " -p " & SAPort & " -L " & mType & " < " & Chr(34) & oMessage.FileName & Chr(34), Verbose, True

Did you ever get this working?
Also, how do I run spamd with --allow-tell? Is that an argument in the spamd windows service?
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2020-11-20 15:52
SorenR wrote: ↑2018-07-26 19:21
OK, this is slowly getting out of hand ...
Support for "spamc.exe -L" and "sa-learn.exe". For spamc to work you must run spamd with "--allow-tell".
When Batch = 0 (spamc) mode is selected there will be no sync of the database and no backup. Spamd is doing the sync on-the-fly and backup is on you.
sa-learn.vbs Version 0.6.0
Code: Select all
oCMD.Run "cmd.exe /C spamc.exe -d " & SAHost & " -p " & SAPort & " -L " & mType & " < " & Chr(34) & oMessage.FileName & Chr(34), Verbose, True
Did you ever get this working?
Also, how do I run spamd with --allow-tell? Is that an argument in the spamd windows service?

Yes and Yes ...
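For anyone wanting to try this by hand: --allow-tell is an ordinary spamd command-line switch (short form -l), and spamc -L accepts ham, spam or forget. A sketch run from the SpamAssassin install folder; how the Windows service passes its arguments depends on how the service was set up, so that part is not shown. The host, port and sample.eml are assumptions (783 is spamd's default port):

```batch
REM Console 1: start spamd with learning via spamc enabled.
spamd.exe --allow-tell

REM Console 2: feed one message to Bayes as spam through spamd.
spamc.exe -d 127.0.0.1 -p 783 -L spam < sample.eml
```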
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2020-11-20 16:32
palinka wrote: ↑2020-11-20 15:52
SorenR wrote: ↑2018-07-26 19:21
OK, this is slowly getting out of hand ...
Support for "spamc.exe -L" and "sa-learn.exe". For spamc to work you must run spamd with "--allow-tell".
When Batch = 0 (spamc) mode is selected there will be no sync of the database and no backup. Spamd is doing the sync on-the-fly and backup is on you.
sa-learn.vbs Version 0.6.0
Code: Select all
oCMD.Run "cmd.exe /C spamc.exe -d " & SAHost & " -p " & SAPort & " -L " & mType & " < " & Chr(34) & oMessage.FileName & Chr(34), Verbose, True
Did you ever get this working?
Also, how do I run spamd with --allow-tell? Is that an argument in the spamd windows service?
Yes and Yes ...

OK cool. I'm cooking something.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
My test script seems to run fine even though I did not change "--allow-tell".
First operational run had errors for file too large for spamc to scan:
Code: Select all
Finished feeding 190 messages to Bayes in 1 minute 47 seconds
Learned tokens from 141 of 185 HAM messages fed to Bayes
Learned tokens from 5 of 5 SPAM messages fed to Bayes
----------------------------
Successfully synced Bayes database
Code: Select all
Finished feeding 192 messages to Bayes in 1 minute 40 seconds
Learned tokens from 1 of 187 HAM messages fed to Bayes
Learned tokens from 1 of 5 SPAM messages fed to Bayes
----------------------------
Successfully synced Bayes database
I'm trying to incorporate this into my powershell backup routine.
Test script:
Code: Select all
<#
.SYNOPSIS
Prune Messages & Feed Bayes
.DESCRIPTION
Delete messages in specified folders older than N days
Feeds messages to Bayes
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, deletes all messages older than N days within that folder and all subfolders within
Deletes empty subfolders within matching folders if DeleteEmptySubFolders set to True in config
Feeds ham and spam to Bayes
.PARAMETER
.NOTES
Folder name matching occurs at any level folder
Empty folders are assumed to be trash if they're located in this script
Only empty folders found in levels BELOW matching level will be deleted
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = "supersecretpassword" # hMailServer Admin password
$DoDelete = $False # FOR TESTING - set to false to run and report results without deleting messages and folders
$DoSpamC = $False # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$PruneSubFolders = $True # True will prune all folders in levels below name matching folders
$DeleteEmptySubFolders = $True # True will delete empty subfolders below the matching level unless a subfolder within contains messages
$DaysBeforeDelete = 30 # Number of days to keep messages in pruned folders
$PruneFolders = "2nd level|Trash|Deleted|Junk|Spam|APCUPSD|BrotherMFC|Administrative|Horde|SAUserList|Chase|Unsubscribes" # Names of IMAP folders you want to cleanup - uses regex
$FeedBayes = $True # Feed spamC with spam/ham to populate bayes database
$HamFolders = "INBOX|Ham" # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = "Spam|Junk" # Spam folders to feed messages to spamC for bayes database - uses regex
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = "C:\Program Files\JAM Software\SpamAssassin for Windows" # SpamAssassin Install Directory
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
<# Set pruning variables #>
Set-Variable -Name TotalDeletedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalDeletedFolders -Value 0 -Option AllScope
Set-Variable -Name DeleteMessageErrors -Value 0 -Option AllScope
Set-Variable -Name DeleteFolderErrors -Value 0 -Option AllScope
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetSubFolders ($Folder) {
$IterateFolder = 0
$ArrayDeletedFolders = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($PruneSubFolders) {GetMessages $SubFolder}
} Else {
If ($DeleteEmptySubFolders) {$ArrayDeletedFolders += $SubFolderID}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
If ($DeleteEmptySubFolders) {
$ArrayDeletedFolders | ForEach {
$CheckFolder = $Folder.SubFolders.ItemByDBID($_)
$FolderName = $CheckFolder.Name
If (SubFoldersEmpty $CheckFolder) {
Try {
If ($DoDelete) {$Folder.SubFolders.DeleteByDBID($_)}
$TotalDeletedFolders++
Debug "Deleted empty subfolder $FolderName in $AccountAddress"
}
Catch {
$DeleteFolderErrors++
Debug "[ERROR] Deleting empty subfolder $FolderName in $AccountAddress"
Debug "[ERROR] : $Error"
}
$Error.Clear()
}
}
}
$ArrayDeletedFolders.Clear()
}
Function SubFoldersEmpty ($Folder) {
<# Returns $True only if no folder below $Folder, at any depth, contains messages #>
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
If ($SubFolder.Messages.Count -gt 0) {
Return $False
}
If ($SubFolder.SubFolders.Count -gt 0) {
<# Use the recursive result instead of leaking it into the output stream #>
If (-not (SubFoldersEmpty $SubFolder)) {Return $False}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
Return $True
}
Function GetMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If ($SubFolderName -match [regex]$PruneFolders) {
GetSubFolders $SubFolder
GetMessages $SubFolder
} Else {
GetMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetMessages ($Folder) {
$IterateMessage = 0
$ArrayMessagesToDelete = @()
$DeletedMessages = 0
If ($Folder.Messages.Count -gt 0) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -lt ((Get-Date).AddDays(-$DaysBeforeDelete))) {
$ArrayMessagesToDelete += $Message.ID
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
$ArrayMessagesToDelete | ForEach {
$AFolderName = $Folder.Name
Try {
If ($DoDelete) {$Folder.Messages.DeleteByDBID($_)}
$DeletedMessages++
$TotalDeletedMessages++
}
Catch {
$DeleteMessageErrors++
Debug "[ERROR] Deleting messages from folder $AFolderName in $AccountAddress"
Debug "[ERROR] $Error"
}
$Error.Clear()
}
If ($DeletedMessages -gt 0) {
Debug "Deleted $DeletedMessages messages from $AFolderName in $AccountAddress"
}
$ArrayMessagesToDelete.Clear()
}
Function PruneMessages {
$Error.Clear()
$BeginDeletingOldMessages = Get-Date
Debug "----------------------------"
Debug "Begin deleting messages older than $DaysBeforeDelete days"
If (-not($DoDelete)) {
Debug "Delete disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If ($hMSIMAPFolder.Name -match [regex]$PruneFolders) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
If ($DeleteMessageErrors -gt 0) {
Debug "Finished Message Pruning : $DeleteMessageErrors Errors present"
Email "[ERROR] Message Pruning : $DeleteMessageErrors Errors present : Check debug log"
} Else {
If ($TotalDeletedMessages -gt 0) {
Debug "Finished pruning $TotalDeletedMessages messages in $(ElapsedTime $BeginDeletingOldMessages)"
Email "[OK] Finished pruning $TotalDeletedMessages messages in $(ElapsedTime $BeginDeletingOldMessages)"
} Else {
Debug "No messages older than $DaysBeforeDelete days to prune"
Email "[OK] No messages older than $DaysBeforeDelete days to prune"
}
}
If ($DeleteFolderErrors -gt 0) {
Debug "Deleting Empty Folders : $DeleteFolderErrors Errors present"
Email "[ERROR] Deleting Empty Folders : $DeleteFolderErrors Errors present : Check debug log"
} Else {
If ($TotalDeletedFolders -gt 0) {
Debug "Deleted $TotalDeletedFolders empty subfolders"
Email "[OK] Deleted $TotalDeletedFolders empty subfolders"
} Else {
Debug "No empty subfolders deleted"
Email "[OK] No empty subfolders deleted"
}
}
}
PruneMessages
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($PruneSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match [regex]$HamFolders) -or ($SubFolderName -match [regex]$SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match [regex]$HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match [regex]$SpamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {$LearnedHamMessages++}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {$LearnedSpamMessages++}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Fed $HamFedMessages HAM messages to SpamC from $AccountAddress"
}
If ($SpamFedMessages -gt 0) {
Debug "Fed $SpamFedMessages SPAM messages to SpamC from $AccountAddress"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin feeding SpamC"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match [regex]$HamFolders) -or ($hMSIMAPFolder.Name -match [regex]$SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetBayesMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetBayesMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Errors present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalHamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedHamMessages of $TotalHamFedMessages HAM messages fed to Bayes"
Email "[OK] Learned tokens from $LearnedHamMessages of $TotalHamFedMessages HAM messages fed to Bayes"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Errors present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalSpamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedSpamMessages of $TotalSpamFedMessages SPAM messages fed to Bayes"
Email "[OK] Learned tokens from $LearnedSpamMessages of $TotalSpamFedMessages SPAM messages fed to Bayes"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
}
}
FeedBayes
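Try {
& cmd /c "`"$SADir\sa-learn.exe`" --sync"
<# Native commands do not throw on failure, so check the exit code #>
If ($LASTEXITCODE -ne 0) {Throw "sa-learn.exe --sync exited with code $LASTEXITCODE"}
Debug "----------------------------"
Debug "Successfully synced Bayes database"
}
Catch {
$Err = $Error[0]
Debug "----------------------------"
Debug "[ERROR] syncing Bayes : $Err"
}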
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Ehhh.... Never mind. Just took a dump.
At least I know what to do now.
Code: Select all
0.000 0 0 0 non-token data: last journal sync atime
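For anyone who wants to read that dump from a script rather than by eye, here is a rough sketch. It assumes the "non-token data" rows of `sa-learn --dump magic` keep the layout shown above, with the value in the third column. `ConvertFrom-BayesMagic` is a made-up helper name, and the counts in the sample are invented for illustration; only the last sample row comes from this thread.

```powershell
# Hypothetical helper (not part of the scripts in this thread):
# turn "sa-learn --dump magic" lines into a label -> value hashtable.
function ConvertFrom-BayesMagic {
    param([string[]]$Lines)
    $Magic = @{}
    foreach ($Line in $Lines) {
        # Columns: probability, spam count, ham count, atime, then the label.
        # For "non-token data" rows the value sits in the third column.
        if ($Line -match '^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s+(.+?)\s*$') {
            $Magic[$Matches[2]] = [long]$Matches[1]
        }
    }
    return $Magic
}

# Sample input: the nspam/nham numbers are invented for illustration.
$Sample = @(
    '0.000          0          3          0  non-token data: bayes db version',
    '0.000          0        146          0  non-token data: nspam',
    '0.000          0       1200          0  non-token data: nham',
    '0.000 0 0 0 non-token data: last journal sync atime'
)
$Counts = ConvertFrom-BayesMagic $Sample
Write-Host "nspam=$($Counts['nspam']) nham=$($Counts['nham'])"
```

Running the real dump through the same function before and after a feed and comparing `nspam` and `nham` should show whether anything was actually learned.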
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
palinka wrote: ↑2020-11-20 17:38
First operational run had errors for file too large for spamc to scan:
Then I changed the script to ignore too-large files (> 512k) and got this:
Code: Select all
Finished feeding 190 messages to Bayes in 1 minute 47 seconds
Learned tokens from 141 of 185 HAM messages fed to Bayes
Learned tokens from 5 of 5 SPAM messages fed to Bayes
----------------------------
Successfully synced Bayes database
Do you think I'm missing something?
Code: Select all
Finished feeding 192 messages to Bayes in 1 minute 40 seconds
Learned tokens from 1 of 187 HAM messages fed to Bayes
Learned tokens from 1 of 5 SPAM messages fed to Bayes
----------------------------
Successfully synced Bayes database
Nope. I can see you received 2 new messages: 1 of the 2 new messages was used to learn some HAM tokens, and one SPAM message was revisited.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
But spamd will not use these tokens unless the journal is synced - correct?
I think the tokens are being placed into the db, but not being used to score spam.
I need that --allow-tell...
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Code: Select all
# If this option is set, whenever SpamAssassin does Bayes learning, it will
# put the information into the journal instead of directly into the database.
# This lowers contention for locking the database to execute an update, but
# will also cause more access to the journal and cause a delay before the
# updates are actually committed to the Bayes database.
#
# bayes_learn_to_journal (default: 0)
No Journal involved.
Code: Select all
-l, --allow-tell
Allow learning and forgetting (to a local Bayes database), reporting and revoking (to a remote database) by spamd. The client issues a TELL command to tell what type of message is being processed and whether local (learn/forget) or remote (report/revoke) databases should be updated.
SørenR.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2020-11-20 18:09
Code: Select all
# If this option is set, whenever SpamAssassin does Bayes learning, it will
# put the information into the journal instead of directly into the database.
# This lowers contention for locking the database to execute an update, but
# will also cause more access to the journal and cause a delay before the
# updates are actually committed to the Bayes database.
#
# bayes_learn_to_journal (default: 0)
No Journal involved.
Code: Select all
-l, --allow-tell
Allow learning and forgetting (to a local Bayes database), reporting and revoking (to a remote database) by spamd. The client issues a TELL command to tell what type of message is being processed and whether local (learn/forget) or remote (report/revoke) databases should be updated.
Now I'm really confused. Do I need to reset my spamd service with the argument --allow-tell or not?
It looks like it's working, but I'm not sure, and I'm not even sure how to test it.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Add "--allow-tell" to your service to allow SPAMC to report SPAM/HAM.
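Once spamd accepts TELL, the text that `spamc -L` prints can be sorted into outcomes. A minimal sketch, assuming the two reply strings matched by the script posted above are the only "OK" replies; `Get-SpamcLearnStatus` is a made-up name, and the sample replies are illustrative strings, not captured logs.

```powershell
# Hypothetical helper (not part of the scripts in this thread): classify the
# text printed by "spamc -L" so a caller can count learned, skipped and failed messages.
function Get-SpamcLearnStatus {
    param([string]$SpamcOutput)
    # The two reply strings below are the ones the posted script already matches.
    if ($SpamcOutput -match 'Message successfully un/learned') { return 'Learned' }
    if ($SpamcOutput -match 'Message was already un/learned')  { return 'AlreadyLearned' }
    # Anything else (including empty output) is treated as a failure.
    return 'Error'
}

# Example: tally a few replies (sample strings only).
$Replies = @(
    'Message successfully un/learned',
    'Message was already un/learned',
    ''
)
$Replies |
    ForEach-Object { Get-SpamcLearnStatus $_ } |
    Group-Object -NoElement |
    ForEach-Object { Write-Host "$($_.Name): $($_.Count)" }
```

Treating every unrecognised reply as an error is a deliberate choice here: if spamd refuses the TELL command, the message should show up in the error count rather than pass silently.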
SørenR.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
OK, here's my version. Sorry about the commingled script for deleting old messages - I'm too lazy to untangle it, so I'm just pasting it in its entirety. This will be in my backup & offsite upload routine shortly.
Code: Select all
<#
.SYNOPSIS
Prune Messages & Feed Bayes
.DESCRIPTION
Delete messages in specified folders older than N days
Feeds messages to Bayes
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, deletes all messages older than N days within that folder and all subfolders within
Deletes empty subfolders within matching folders if DeleteEmptySubFolders set to True in config
Feeds ham and spam to Bayes
.PARAMETER
.NOTES
Folder name matching occurs at any level folder
Empty folders are assumed to be trash if they're located in this script
Only empty folders found in levels BELOW matching level will be deleted
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = 'supersecretpassword' # hMailServer Admin password
$DoDelete = $False # FOR TESTING - set to false to run and report results without deleting messages and folders
$DoSpamC = $True # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$PruneSubFolders = $True # True will prune all folders in levels below name matching folders
$DeleteEmptySubFolders = $True # True will delete empty subfolders below the matching level unless a subfolder within contains messages
$DaysBeforeDelete = 30 # Number of days to keep messages in pruned folders
$PruneFolders = '2nd level|Trash|Deleted|Junk|Spam|APCUPSD|BrotherMFC|Administrative|Horde|SAUserList|Chase|Unsubscribes' # Names of IMAP folders you want to cleanup - uses regex
$FeedBayes = $True # Feed spamC with spam/ham to populate bayes database
$HamFolders = 'Inbox|Ham' # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = 'Spam|Junk' # Spam folders to feed messages to spamC for bayes database - uses regex
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = 'C:\Program Files\JAM Software\SpamAssassin for Windows' # SpamAssassin Install Directory
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
Function Plural ($Integer) {
If ($Integer -eq 1) {$S = ""} Else {$S = "s"}
Return $S
}
<# Set pruning variables #>
Set-Variable -Name TotalDeletedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalDeletedFolders -Value 0 -Option AllScope
Set-Variable -Name DeleteMessageErrors -Value 0 -Option AllScope
Set-Variable -Name DeleteFolderErrors -Value 0 -Option AllScope
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetSubFolders ($Folder) {
$IterateFolder = 0
$ArrayDeletedFolders = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($PruneSubFolders) {GetMessages $SubFolder}
} Else {
If ($DeleteEmptySubFolders) {$ArrayDeletedFolders += $SubFolderID}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
If ($DeleteEmptySubFolders) {
$ArrayDeletedFolders | ForEach {
$CheckFolder = $Folder.SubFolders.ItemByDBID($_)
$FolderName = $CheckFolder.Name
If (SubFoldersEmpty $CheckFolder) {
Try {
If ($DoDelete) {$Folder.SubFolders.DeleteByDBID($_)}
$TotalDeletedFolders++
Debug "Deleted empty subfolder $FolderName in $AccountAddress"
}
Catch {
$DeleteFolderErrors++
Debug "[ERROR] Deleting empty subfolder $FolderName in $AccountAddress"
Debug "[ERROR] : $Error"
}
$Error.Clear()
}
}
}
$ArrayDeletedFolders.Clear()
}
Function SubFoldersEmpty ($Folder) {
<# Returns $True only if no folder below $Folder, at any depth, contains messages #>
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
If ($SubFolder.Messages.Count -gt 0) {
Return $False
}
If ($SubFolder.SubFolders.Count -gt 0) {
<# Use the recursive result instead of leaking it into the output stream #>
If (-not (SubFoldersEmpty $SubFolder)) {Return $False}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
Return $True
}
Function GetMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If ($SubFolderName -match $PruneFolders) {
GetSubFolders $SubFolder
GetMessages $SubFolder
} Else {
GetMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetMessages ($Folder) {
$IterateMessage = 0
$ArrayMessagesToDelete = @()
$DeletedMessages = 0
If ($Folder.Messages.Count -gt 0) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -lt ((Get-Date).AddDays(-$DaysBeforeDelete))) {
$ArrayMessagesToDelete += $Message.ID
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
$ArrayMessagesToDelete | ForEach {
$AFolderName = $Folder.Name
Try {
If ($DoDelete) {$Folder.Messages.DeleteByDBID($_)}
$DeletedMessages++
$TotalDeletedMessages++
}
Catch {
$DeleteMessageErrors++
Debug "[ERROR] Deleting messages from folder $AFolderName in $AccountAddress"
Debug "[ERROR] $Error"
}
$Error.Clear()
}
If ($DeletedMessages -gt 0) {
Debug "Deleted $DeletedMessages message$(Plural $DeletedMessages) from $AFolderName in $AccountAddress"
}
$ArrayMessagesToDelete.Clear()
}
Function PruneMessages {
$Error.Clear()
$BeginDeletingOldMessages = Get-Date
Debug "----------------------------"
Debug "Begin deleting messages older than $DaysBeforeDelete days"
If (-not($DoDelete)) {
Debug "Delete disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If ($hMSIMAPFolder.Name -match $PruneFolders) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
If ($DeleteMessageErrors -gt 0) {
Debug "Finished Message Pruning : $DeleteMessageErrors Errors present"
Email "[ERROR] Message Pruning : $DeleteMessageErrors Errors present : Check debug log"
} Else {
If ($TotalDeletedMessages -gt 0) {
Debug "Finished pruning $TotalDeletedMessages messages in $(ElapsedTime $BeginDeletingOldMessages)"
Email "[OK] Finished pruning $TotalDeletedMessages messages in $(ElapsedTime $BeginDeletingOldMessages)"
} Else {
Debug "No messages older than $DaysBeforeDelete days to prune"
Email "[OK] No messages older than $DaysBeforeDelete days to prune"
}
}
If ($DeleteFolderErrors -gt 0) {
Debug "Deleting Empty Folders : $DeleteFolderErrors Errors present"
Email "[ERROR] Deleting Empty Folders : $DeleteFolderErrors Errors present : Check debug log"
} Else {
If ($TotalDeletedFolders -gt 0) {
Debug "Deleted $TotalDeletedFolders empty subfolders"
Email "[OK] Deleted $TotalDeletedFolders empty subfolders"
} Else {
Debug "No empty subfolders deleted"
Email "[OK] No empty subfolders deleted"
}
}
}
<# Feed Bayes #>
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($PruneSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match $HamFolders) -or ($SubFolderName -match $SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
$FolderName = $Folder.Name
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match $HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match $SpamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {$LearnedHamMessages++}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {$LearnedSpamMessages++}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Fed $HamFedMessages HAM message$(Plural $HamFedMessages) from $FolderName in $AccountAddress"
}
If ($SpamFedMessages -gt 0) {
Debug "Fed $SpamFedMessages SPAM message$(Plural $SpamFedMessages) from $FolderName in $AccountAddress"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin feeding SpamC"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match $HamFolders) -or ($hMSIMAPFolder.Name -match $SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetBayesMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetBayesMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Error$(Plural $HamFedMessageErrors) present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalHamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
Email "[OK] Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Error$(Plural $SpamFedMessageErrors) present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalSpamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
Email "[OK] Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
}
Try {
& cmd /c "`"$SADir\sa-learn.exe`" --backup > `"X:\sa-learn\.spamassassin\bayes_backup`""
Debug "----------------------------"
Debug "Successfully backed up Bayes database"
}
Catch {
$Err = $Error[0]
Debug "----------------------------"
Debug "[ERROR] backing up Bayes : $Err"
}
}
PruneMessages
FeedBayes
Last edited by palinka on 2020-11-21 00:50, edited 1 time in total.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
sa-learn is used for batch learning as it can reference an entire directory, so using the journal makes sense: the database is not locked for the duration of processing the directory. "--sync" is required to update the database from the journal and activate the learned tokens.
spamc however is normally used as a one-off learning process and is done by directly addressing the database.
If SpamAssassin is running on a different host it is more efficient to "net use" the drive with the email folders and use sa-learn, thus putting the major stress on the network and not the Bayes database.
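A minimal sketch of the batch approach described above. The drive letter, share name and directory paths are hypothetical, and sa-learn is assumed to be on the PATH. `--no-sync` skips the journal sync after each training run, and the final `--sync` merges the journal into the database once.

```powershell
# Hypothetical paths: Z: is assumed to be mapped beforehand, for example with
#   net use Z: \\mailserver\maildata
$SpamDir = 'Z:\sorted\spam'
$HamDir  = 'Z:\sorted\ham'
$DryRun  = $True   # set to $False to actually call sa-learn (must be on the PATH)

# One entry per sa-learn call: train spam, train ham, then sync the journal once.
$Batch = @(
    @{ Arguments = @('--no-sync', '--spam', $SpamDir) },
    @{ Arguments = @('--no-sync', '--ham',  $HamDir) },
    @{ Arguments = @('--sync') }
)

ForEach ($Step in $Batch) {
    If ($DryRun) {
        Write-Host "Would run: sa-learn $($Step.Arguments -join ' ')"
    } Else {
        & sa-learn $Step.Arguments
    }
}
```

With `$DryRun` left at `$True` the loop only prints the three commands, which makes it easy to check the paths before touching the Bayes database.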
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2020-11-21 00:47
sa-learn is used for batch learning as it can reference an entire directory, so using the journal makes sense: the database is not locked for the duration of processing the directory. "--sync" is required to update the database from the journal and activate the learned tokens.
spamc however is normally used as a one-off learning process and is done by directly addressing the database.
If SpamAssassin is running on a different host it is more efficient to "net use" the drive with the email folders and use sa-learn, thus putting the major stress on the network and not the Bayes database.
Got it.
Autolearn is starting to make sense now as well. I'm not using it, but a lot of knowledge gaps have been filled in.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
I did my monthly cleanup a couple of days ago...
Regular clients get to keep 30 days of SPAM in case they forgot to move a message to/from the SPAM folder and 30 days of deleted messages. Non-expunged messages are expunged in all mail folders.
My SPAM account gets to keep 6 months' worth of foul language ....
Received by the SPAM account ... the SPAM account gets a copy of ALL SPAM.
Tagged by SpamAssassin: 879
Not tagged by SpamAssassin: 93
Tagged by SpamAssassin AND hMailServer: 1991
Without having analysed the 1991 emails, I believe I could rely on SpamAssassin for 95% of my SPAM fighting. That is the result of a well-trained Bayes database.
EDIT: Actually 1974 of the 1991 are tagged by SpamAssassin ...
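As a rough check of the 95% estimate, the figures above can be added up. This sketch assumes the three counts are disjoint groups and that the EDIT means 1974 of the 1991 dual-tagged messages carry a SpamAssassin tag; both readings are my assumption, not something the post states outright.

```powershell
# Figures quoted in the post above.
$TaggedBySAOnly   = 879    # tagged by SpamAssassin
$NotTaggedBySA    = 93     # not tagged by SpamAssassin
$TaggedByBoth     = 1991   # tagged by SpamAssassin AND hMailServer
$BothActuallyBySA = 1974   # of the 1991, tagged by SpamAssassin (per the EDIT)

# Total received by the SPAM account, and the share SpamAssassin tagged.
$Total    = $TaggedBySAOnly + $NotTaggedBySA + $TaggedByBoth
$CaughtSA = $TaggedBySAOnly + $BothActuallyBySA
$Percent  = [math]::Round(100 * $CaughtSA / $Total, 1)
Write-Host "SpamAssassin tagged $CaughtSA of $Total messages ($Percent%)"
```

Under those assumptions SpamAssassin tagged 2853 of 2963 messages, a little over 96%, which is in line with the 95% estimate.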
SørenR.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Not with all those delicious spam filters you came up with. My spam fighting is highly tilted toward reject first, ask questions later.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Well... I reject non-RFC-compliant HELO/EHLO greetings, and the following list of TLDs:
.top, .xyz, .icu, .best, .ga, .club, .press, .today, .guru, .casa, .tk, .ml, .work, .buzz, .co, .monster, .cyou
Besides any addresses flagged as Snowshoe SPAM or Lashback SPAM.
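Purely as an illustration of the TLD check (not the actual hMailServer event code, which is not shown in this thread), a PowerShell sketch could look like this. The function name `Test-RejectedTLD` is hypothetical.

```powershell
# TLDs copied from the list above.
$RejectTLDs = '.top', '.xyz', '.icu', '.best', '.ga', '.club', '.press', '.today',
              '.guru', '.casa', '.tk', '.ml', '.work', '.buzz', '.co', '.monster', '.cyou'

# Hypothetical helper: True when the sender's domain ends in one of the listed TLDs.
Function Test-RejectedTLD ($Address) {
    $Domain = ($Address -split '@')[-1].ToLower()
    ForEach ($TLD in $RejectTLDs) {
        If ($Domain.EndsWith($TLD)) { Return $True }
    }
    Return $False
}

Test-RejectedTLD 'someone@cheap-pills.xyz'   # True
Test-RejectedTLD 'someone@example.com'       # False: ".com" does not end in ".co"
```

Matching on the end of the domain, with the leading dot included, keeps `.co` from catching `.com` or `.co.uk`.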
Everything else is fed to the SpamAssassin beast and the ones that survive are delivered to my clients, the rest to my SPAM account. I have a procedure for false positives in my SPAM account; my clients, however, handle their own false positives by moving messages in or out of their SPAM folders.
Sometimes I do have to intervene manually and whitelist messages if flagged by RBL or SURBL.
It's a work in progress and you never finish.
I have blacklists that I use for small amounts of non-found SPAM - it takes about 1-2 weeks before SpamAssassin has properly learned the non-found SPAM, after which I can deactivate the entry in my blacklist.
My black- and whitelists have date fields and hit counts that enable me to clean up unused entries after a while. I simply deactivate any unused entries but leave them in the list so I can reactivate them later if the problem comes back.
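The clean-up idea can be sketched roughly as follows. The field names (`Pattern`, `LastHit`, `HitCount`, `Active`), the sample entries and the 30-day threshold are all assumptions for illustration; the real lists are not shown in this thread.

```powershell
# Hypothetical list entries; the field names are assumptions for illustration.
$Blacklist = @(
    [PSCustomObject]@{ Pattern = 'spammer.example'; LastHit = (Get-Date).AddDays(-3);  HitCount = 12; Active = $True }
    [PSCustomObject]@{ Pattern = 'old.example';     LastHit = (Get-Date).AddDays(-90); HitCount = 1;  Active = $True }
)
$MaxIdleDays = 30   # assumed threshold

# Deactivate entries that have not been hit recently, but keep them in the list
# so they can be switched back on later.
ForEach ($Entry in $Blacklist) {
    If ($Entry.Active -and $Entry.LastHit -lt (Get-Date).AddDays(-$MaxIdleDays)) {
        $Entry.Active = $False
    }
}

$Blacklist | Format-Table Pattern, HitCount, Active
```

Entries are only flagged inactive, never deleted, which matches the "leave them in the list" approach described above.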
SørenR.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2020-11-21 13:39
Well... I reject non-RFC-compliant HELO/EHLO greetings, and the following list of TLDs:
.top, .xyz, .icu, .best, .ga, .club, .press, .today, .guru, .casa, .tk, .ml, .work, .buzz, .co, .monster, .cyou
Besides any addresses flagged as Snowshoe SPAM or Lashback SPAM.
Everything else is fed to the SpamAssassin beast and the ones that survive are delivered to my clients, the rest to my SPAM account. I have a procedure for false positives in my SPAM account; my clients, however, handle their own false positives by moving messages in or out of their SPAM folders.
Sometimes I do have to intervene manually and whitelist messages if flagged by RBL or SURBL.
It's a work in progress and you never finish.
I have blacklists that I use for small amounts of non-found SPAM - it takes about 1-2 weeks before SpamAssassin has properly learned the non-found SPAM, after which I can deactivate the entry in my blacklist.
My black- and whitelists have date fields and hit counts that enable me to clean up unused entries after a while. I simply deactivate any unused entries but leave them in the list so I can reactivate them later if the problem comes back.
Yep. I got all that too. Thanks to copy/pasting your scripts.
Actually, you taught me so much, I'm doing my own thing now. We can finally collaborate instead of the one-way street it used to be.
You should check out my backup routine. It's evolving into a one-stop shop for hMailServer daily maintenance. Bayes training now included!
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Here you go - untangled version. Bayes-feeding-only script with a couple of minor fix-ups. Stick a fork in it. It's done.
Code: Select all
<#
.SYNOPSIS
Feed Bayes
.DESCRIPTION
Feeds messages to Bayes for spam/ham learning
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, feeds messages to spamc for learning
.PARAMETER
.NOTES
Add "--allow-tell" argument to your SPAMD service to allow SPAMC to report SPAM/HAM
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = 'supersecretpassword' # hMailServer Admin password
$DoSpamC = $True # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$BayesSubFolders = $True # True will feed messages from regex name matching subfolders
$HamFolders = 'Inbox|Ham' # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = 'Spam|Junk' # Spam folders to feed messages to spamC for bayes database - uses regex
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = 'C:\Program Files\JAM Software\SpamAssassin for Windows' # SpamAssassin Install Directory
$BayesBackupLocation = "X:\sa-learn\.spamassassin\bayes_backup" # Bayes backup file
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
Function Plural ($Integer) {
If ($Integer -eq 1) {$S = ""} Else {$S = "s"}
Return $S
}
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($BayesSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match $HamFolders) -or ($SubFolderName -match $SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
$LearnedHamMessagesFolder = 0
$LearnedSpamMessagesFolder = 0
$FolderName = $Folder.Name
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match $HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match $SpamFolders) {
$IterateMessage = 0 # restart the index in case the ham loop above already advanced it
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedHamMessages++
$LearnedHamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedSpamMessages++
$LearnedSpamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedHamMessagesFolder of $HamFedMessages HAM message$(Plural $HamFedMessages) fed from $FolderName in $AccountAddress"
}
If ($SpamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedSpamMessagesFolder of $SpamFedMessages SPAM message$(Plural $SpamFedMessages) fed from $FolderName in $AccountAddress"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin learning Bayes tokens from messages newer than $BayesDays days"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match $HamFolders) -or ($hMSIMAPFolder.Name -match $SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetBayesMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetBayesMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
Debug "----------------------------"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Error$(Plural $HamFedMessageErrors) present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalHamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
Email "[OK] Bayes HAM learn from $LearnedHamMessages of $TotalHamFedMessages message$(Plural $TotalHamFedMessages)"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Error$(Plural $SpamFedMessageErrors) present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
} Else {
If ($TotalSpamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
Email "[OK] Bayes SPAM learn from $LearnedSpamMessages of $TotalSpamFedMessages message$(Plural $TotalSpamFedMessages)"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
}
Debug "----------------------------"
Try {
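& cmd /c "`"$SADir\sa-learn.exe`" --backup > `"$BayesBackupLocation`"" # -ErrorAction applies only to cmdlets, so it is dropped from this native call
If ($LASTEXITCODE -ne 0) {Throw "sa-learn exited with code $LASTEXITCODE"} # assumes a non-zero exit code signals failure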
Debug "Successfully backed up Bayes database"
}
Catch {
$Err = $Error[0]
Debug "[ERROR] backing up Bayes : $Err"
}
}
FeedBayes
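The script above decides what happened to each message by matching the text spamc prints back. A small sketch of that classification on its own, using the same two strings the script matches; the function name `Get-SpamCLearnStatus` and the status labels are hypothetical.

```powershell
# Classify spamc -L output using the same strings the script above matches.
Function Get-SpamCLearnStatus ($SpamCResult) {
    If ($SpamCResult -match 'Message successfully un/learned') { Return 'Learned' }
    If ($SpamCResult -match 'Message was already un/learned')  { Return 'AlreadyKnown' }
    Return 'Error'   # anything else is treated as a failure, as in the script
}

Get-SpamCLearnStatus 'Message successfully un/learned'   # Learned
Get-SpamCLearnStatus 'Message was already un/learned'    # AlreadyKnown
Get-SpamCLearnStatus ''                                  # Error
```

Keeping the three outcomes separate explains why the log reports "learned from X of Y": messages that were already known count as fed but not as learned.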
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
I made some changes RvdH suggested in the "delete older than N days" thread. The two use the same COM routine for finding messages. I also figured out a way to successfully trap errors from sa-learn --backup.
Code: Select all
<#
.SYNOPSIS
Feed Bayes Database
.DESCRIPTION
Feeds messages to SPAMC for Bayes spam/ham learning
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, feeds messages to spamc for learning
.PARAMETER
.NOTES
Add "--allow-tell" argument to your SPAMD service to allow SPAMC to report SPAM/HAM
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = 'secretpassword' # hMailServer Admin password
$DoSpamC = $True # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$BayesSubFolders = $True # True will feed messages from regex name matching subfolders
$HamFolders = 'Inbox|Ham' # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = 'Spam|Junk' # Spam folders to feed messages to spamC for bayes database - uses regex
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = 'C:\Program Files\JAM Software\SpamAssassin for Windows' # SpamAssassin Install Directory
$BayesBackupLocation = "X:\sa-learn\.spamassassin\bayes_backup" # Bayes backup FILE
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
Function Plural ($Integer) {
If ($Integer -eq 1) {$S = ""} Else {$S = "s"}
Return $S
}
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($BayesSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match $HamFolders) -or ($SubFolderName -match $SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
$LearnedHamMessagesFolder = 0
$LearnedSpamMessagesFolder = 0
$FolderName = $Folder.Name
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match $HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match $SpamFolders) {
$IterateMessage = 0 # restart the index in case the ham loop above already advanced it
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedHamMessages++
$LearnedHamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedSpamMessages++
$LearnedSpamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $AccountAddress"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedHamMessagesFolder of $HamFedMessages HAM message$(Plural $HamFedMessages) fed from $FolderName in $AccountAddress"
}
If ($SpamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedSpamMessagesFolder of $SpamFedMessages SPAM message$(Plural $SpamFedMessages) fed from $FolderName in $AccountAddress"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin learning Bayes tokens from messages newer than $BayesDays days"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
$IterateDomains = 0
If ($hMS.Domains.Count -gt 0) {
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If ($hMSDomain.Active) {
$IterateAccounts = 0
If ($hMSDomain.Accounts.Count -gt 0) {
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If ($hMSAccount.Active) {
$AccountAddress = $hMSAccount.Address
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match $HamFolders) -or ($hMSIMAPFolder.Name -match $SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetBayesMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetBayesMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF ACCOUNT COUNT > 0
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
} # IF DOMAIN COUNT > 0
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
Debug "----------------------------"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Error$(Plural $HamFedMessageErrors) present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
}
If ($TotalHamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
Email "[OK] Bayes HAM learn from $LearnedHamMessages of $TotalHamFedMessages message$(Plural $TotalHamFedMessages)"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Error$(Plural $SpamFedMessageErrors) present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
}
If ($TotalSpamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
Email "[OK] Bayes SPAM learn from $LearnedSpamMessages of $TotalSpamFedMessages message$(Plural $TotalSpamFedMessages)"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
Debug "----------------------------"
Try {
& cmd /c "`"$SADir\sa-learn.exe`" --backup > `"$BayesBackupLocation`""
If ((Get-Item -Path $BayesBackupLocation).LastWriteTime -lt ((Get-Date).AddSeconds(-30))) {
Throw "Unknown Error backing up Bayes database"
}
Debug "Successfully backed up Bayes database"
}
Catch {
$Err = $Error[0]
Debug "[ERROR] backing up Bayes : $Err"
}
}
FeedBayes
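The backup check in the script above looks only at the file's timestamp. Because cmd opens the redirect target before it starts sa-learn, a failed run can, as far as I can tell, still leave a fresh but empty file behind. A hedged sketch of a slightly stricter check follows; the helper name `Test-FreshBackup` is hypothetical and not part of the script.

```powershell
# Hypothetical helper: True only if the backup file exists, is not empty,
# and was written within the last $MaxAgeSeconds seconds.
Function Test-FreshBackup ($Path, $MaxAgeSeconds = 30) {
    If (-not (Test-Path -LiteralPath $Path -PathType Leaf)) { Return $False }
    $File = Get-Item -LiteralPath $Path
    If ($File.Length -eq 0) { Return $False }
    Return ($File.LastWriteTime -ge (Get-Date).AddSeconds(-$MaxAgeSeconds))
}
```

In the script it could replace the LastWriteTime test, for example `If (-not (Test-FreshBackup $BayesBackupLocation)) {Throw "Unknown Error backing up Bayes database"}`.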
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
I guess nothing's ever done. PULL THE FORK! PULL THE FORK!!!
I realized I missed something - the ability to skip domains or user accounts if wanted.
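One thing to watch with regex-based skip lists in PowerShell: an empty pattern matches every string, so an empty skip variable would skip everything. A small guard, sketched here with a hypothetical helper name `Test-SkipMatch`, treats an empty or missing pattern as "skip nothing".

```powershell
# An empty regex matches any input, which is why an empty skip list is dangerous.
'user@domain.tld' -match ''        # True

# Hypothetical guard: never skip when the pattern is empty or missing.
Function Test-SkipMatch ($Name, $Pattern) {
    If ([string]::IsNullOrWhiteSpace($Pattern)) { Return $False }
    Return ($Name -match $Pattern)
}

Test-SkipMatch 'user@domain1.tld' 'user@domain1\.tld|spam@domain2\.tld'   # True
Test-SkipMatch 'user@domain1.tld' ''                                      # False
```

Since the values are regular expressions, an unescaped dot matches any character; `[regex]::Escape('domain.tld')` gives a literal pattern if that matters.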
Code: Select all
<#
.SYNOPSIS
Feed Bayes Database
.DESCRIPTION
Feeds messages to SPAMC for Bayes spam/ham learning
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, feeds messages to spamc for learning
.PARAMETER
.NOTES
Add "--allow-tell" argument to your SPAMD service to allow SPAMC to report SPAM/HAM
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = "secretpassword" # hMailServer Admin password
$DoSpamC = $False # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$BayesSubFolders = $True # True will feed messages from regex name matching subfolders
$HamFolders = "Inbox|Ham" # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = "Spam|Junk" # Spam folders to feed messages to spamC for bayes database - uses regex
$SkipAccountBayes = "user@domain1.tld|spam@domain2.tld|dmarc@domain3.tld" # User accounts to skip - uses regex - If not used, leave blank (not "") or it will match EVERYTHING!
$SkipDomainBayes = "domain.tld" # Domains to skip - uses regex - If not used, leave blank (not "") or it will match EVERYTHING!
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = "C:\Program Files\JAM Software\SpamAssassin for Windows" # SpamAssassin Install Directory
$BayesBackupLocation = "X:\sa-learn\.spamassassin\bayes_backup" # Bayes backup file
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
Function Plural ($Integer) {
If ($Integer -eq 1) {$S = ""} Else {$S = "s"}
Return $S
}
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($BayesSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match $HamFolders) -or ($SubFolderName -match $SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
$LearnedHamMessagesFolder = 0
$LearnedSpamMessagesFolder = 0
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match $HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match $SpamFolders) {
$IterateMessage = 0 # restart the index in case the ham loop above already advanced it
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedHamMessages++
$LearnedHamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $($hMSAccount.Address)"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedSpamMessages++
$LearnedSpamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $($hMSAccount.Address)"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedHamMessagesFolder of $HamFedMessages HAM message$(Plural $HamFedMessages) fed from $($Folder.Name) in $($hMSAccount.Address)"
}
If ($SpamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedSpamMessagesFolder of $SpamFedMessages SPAM message$(Plural $SpamFedMessages) fed from $($Folder.Name) in $($hMSAccount.Address)"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin learning Bayes tokens from messages newer than $BayesDays days"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
$IterateDomains = 0
If ($hMS.Domains.Count -gt 0) {
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If (($hMSDomain.Active) -and ($hMSDomain.Name -notmatch $SkipDomainBayes)) {
$IterateAccounts = 0
If ($hMSDomain.Accounts.Count -gt 0) {
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If (($hMSAccount.Active) -and ($hMSAccount.Address -notmatch $SkipAccountBayes)) {
$IterateIMAPFolders = 0
If ($hMSAccount.IMAPFolders.Count -gt 0) {
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match $HamFolders) -or ($hMSIMAPFolder.Name -match $SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
} # IF SUBFOLDER COUNT > 0
GetBayesMessages $hMSIMAPFolder
} # IF FOLDERNAME MATCH REGEX
Else {
GetBayesMatchFolders $hMSIMAPFolder
} # IF NOT FOLDERNAME MATCH REGEX
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
} # IF IMAPFOLDER COUNT > 0
} #IF ACCOUNT ACTIVE
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
} # IF ACCOUNT COUNT > 0
} # IF DOMAIN ACTIVE
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
} # IF DOMAIN COUNT > 0
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
Debug "----------------------------"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Error$(Plural $HamFedMessageErrors) present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
}
If ($TotalHamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
Email "[OK] Bayes HAM learn from $LearnedHamMessages of $TotalHamFedMessages message$(Plural $TotalHamFedMessages)"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Error$(Plural $SpamFedMessageErrors) present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
}
If ($TotalSpamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
Email "[OK] Bayes SPAM learn from $LearnedSpamMessages of $TotalSpamFedMessages message$(Plural $TotalSpamFedMessages)"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
Debug "----------------------------"
Try {
& cmd /c "`"$SADir\sa-learn.exe`" --backup > `"$BayesBackupLocation`""
If ((Get-Item -Path $BayesBackupLocation).LastWriteTime -lt ((Get-Date).AddSeconds(-30))) {
Throw "Unknown Error backing up Bayes database"
}
Debug "Successfully backed up Bayes database"
}
Catch {
$Err = $Error[0]
Debug "[ERROR] backing up Bayes : $Err"
}
}
FeedBayes
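The result handling in the script above comes down to a three-way classification of what spamc prints after a `-L` request. A minimal sketch, assuming the two response strings the script already matches on are the only non-error cases; `Get-SpamCLearnStatus` is a hypothetical helper name and is not part of the script:

```powershell
# Hypothetical helper: classify the text spamc prints after a -L (learn) request.
# The two matched strings are the ones the script above already checks for.
Function Get-SpamCLearnStatus ([string]$SpamCResult) {
    If ($SpamCResult -match "Message successfully un/learned") { Return "Learned" }
    If ($SpamCResult -match "Message was already un/learned") { Return "AlreadyLearned" }
    Return "Error"   # anything else, including empty output, is treated as a failure
}

Get-SpamCLearnStatus "Message successfully un/learned"
Get-SpamCLearnStatus "Message was already un/learned"
Get-SpamCLearnStatus ""
```

Keeping the classification in one place would let the HAM and SPAM loops share it instead of repeating the same two `-match` tests.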
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Well, you know what they say
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
I'm having trouble wrapping my head around bayes_journal. Since I started using spamC, I've been using it exclusively for bayes learning. On a whim I synced bayes_journal and there were tokens to sync!
Not only that but it keeps growing. Yesterday it read 150 entries. I synced again and got no output, so I assume it only reports when there are entries to sync, and those previous 150 were synced into the database.
So now I'm more confused than ever. I added journal syncing to my script just because better safe than sorry, although at this point, I don't even know if it does anything at all.
So the question is - is journal syncing required with spamC or not?
Code: Select all
bayes: synced databases from journal in 0 seconds: 893 unique entries (900 total entries)
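The sync report line quoted above can be picked apart with a regex if the counts are wanted in a log. A minimal sketch, assuming the line always has the shape shown in this post; `ConvertFrom-BayesSyncLine` is a hypothetical helper name:

```powershell
# Hypothetical helper: pull the counts out of a journal sync report line.
Function ConvertFrom-BayesSyncLine ([string]$Line) {
    $Pattern = "synced databases from journal in (\d+) seconds: (\d+) unique entries \((\d+) total entries\)"
    If ($Line -match $Pattern) {
        Return @{
            Seconds = [int]$Matches[1]
            Unique  = [int]$Matches[2]
            Total   = [int]$Matches[3]
        }
    }
    Return $null   # no report line found (the poster saw no output on a second sync)
}

# Sample line copied from the post above
$Sync = ConvertFrom-BayesSyncLine "bayes: synced databases from journal in 0 seconds: 893 unique entries (900 total entries)"
If ($Sync) {
    Write-Output "Synced $($Sync.Unique) unique of $($Sync.Total) total journal entries"
}
```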
Code: Select all
<#
.SYNOPSIS
Feed Bayes Database
.DESCRIPTION
Feeds messages to SPAMC for Bayes spam/ham learning
.FUNCTIONALITY
Looks for folder name match at any folder level and if found, feeds messages to spamc for learning
.PARAMETER
.NOTES
Add "--allow-tell" argument to your SPAMD service to allow SPAMC to report SPAM/HAM
.EXAMPLE
#>
<### USER VARIABLES ###>
$hMSAdminPass = "secretpassword" # hMailServer Admin password
$DoSpamC = $True # FOR TESTING - set to false to run and report results without feeding SpamC with spam/ham
$BayesSubFolders = $True # True will feed messages from regex name matching subfolders
$HamFolders = "Inbox|Ham" # Ham folders to feed messages to spamC for bayes database - uses regex
$SpamFolders = "Spam|Junk" # Spam folders to feed messages to spamC for bayes database - uses regex
$SkipAccountBayes = '^$' # User accounts to skip - uses regex - '^$' skips nothing; do NOT use "" or it will match EVERYTHING!
$SkipDomainBayes = '^$' # Domains to skip - uses regex - '^$' skips nothing; do NOT use "" or it will match EVERYTHING!
$BayesDays = 7 # Number of days worth of spam/ham to feed to bayes
$SADir = "C:\Program Files\JAM Software\SpamAssassin for Windows" # SpamAssassin Install Directory
$BayesBackupLocation = "X:\sa-learn\bayes\bayes_backup" # Bayes backup file
<### START SCRIPT ###>
<# Functions copied from hMailServer Backup required for testing #>
Function Debug ($DebugOutput) {Write-Host $DebugOutput}
Function Email ($DebugOutput) {}
Function ElapsedTime ($EndTime) {
$TimeSpan = New-Timespan $EndTime
If (([int]($TimeSpan).Hours) -eq 0) {$Hours = ""} ElseIf (([int]($TimeSpan).Hours) -eq 1) {$Hours = "1 hour "} Else {$Hours = "$([int]($TimeSpan).Hours) hours "}
If (([int]($TimeSpan).Minutes) -eq 0) {$Minutes = ""} ElseIf (([int]($TimeSpan).Minutes) -eq 1) {$Minutes = "1 minute "} Else {$Minutes = "$([int]($TimeSpan).Minutes) minutes "}
If (([int]($TimeSpan).Seconds) -eq 1) {$Seconds = "1 second"} Else {$Seconds = "$([int]($TimeSpan).Seconds) seconds"}
If (($TimeSpan).TotalSeconds -lt 1) {
$Return = "less than 1 second"
} Else {
$Return = "$Hours$Minutes$Seconds"
}
Return $Return
}
Function Plural ($Integer) {
If ($Integer -eq 1) {$S = ""} Else {$S = "s"}
Return $S
}
<# Set Bayes variables #>
Set-Variable -Name TotalHamFedMessages -Value 0 -Option AllScope
Set-Variable -Name TotalSpamFedMessages -Value 0 -Option AllScope
Set-Variable -Name HamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name SpamFedMessageErrors -Value 0 -Option AllScope
Set-Variable -Name LearnedHamMessages -Value 0 -Option AllScope
Set-Variable -Name LearnedSpamMessages -Value 0 -Option AllScope
Function GetBayesSubFolders ($Folder) {
$IterateFolder = 0
$ArrayBayesMessages = @()
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
$SubFolderID = $SubFolder.ID
If ($SubFolder.Subfolders.Count -gt 0) {GetBayesSubFolders $SubFolder}
If ($SubFolder.Messages.Count -gt 0) {
If ($BayesSubFolders) {GetBayesMessages $SubFolder}
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
$ArrayBayesMessages.Clear()
}
Function GetBayesMatchFolders ($Folder) {
$IterateFolder = 0
If ($Folder.SubFolders.Count -gt 0) {
Do {
$SubFolder = $Folder.SubFolders.Item($IterateFolder)
$SubFolderName = $SubFolder.Name
If (($SubFolderName -match $HamFolders) -or ($SubFolderName -match $SpamFolders)) {
GetBayesSubFolders $SubFolder
GetBayesMessages $SubFolder
} Else {
GetBayesMatchFolders $SubFolder
}
$IterateFolder++
} Until ($IterateFolder -eq $Folder.SubFolders.Count)
}
}
Function GetBayesMessages ($Folder) {
$IterateMessage = 0
$ArrayHamToFeed = @()
$ArraySpamToFeed = @()
$HamFedMessages = 0
$SpamFedMessages = 0
$LearnedHamMessagesFolder = 0
$LearnedSpamMessagesFolder = 0
If ($Folder.Messages.Count -gt 0) {
If ($Folder.Name -match $HamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArrayHamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
If ($Folder.Name -match $SpamFolders) {
Do {
$Message = $Folder.Messages.Item($IterateMessage)
If ($Message.InternalDate -gt ((Get-Date).AddDays(-$BayesDays))) {
$ArraySpamToFeed += $Message.FileName
}
$IterateMessage++
} Until ($IterateMessage -eq $Folder.Messages.Count)
}
}
$ArrayHamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L ham < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedHamMessages++
$LearnedHamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$HamFedMessages++
$TotalHamFedMessages++
}
}
Catch {
$HamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding HAM message $FileName in $($hMSAccount.Address)"
Debug "[ERROR] $Err"
}
}
$ArraySpamToFeed | ForEach {
$FileName = $_
Try {
If ((Get-Item $FileName).Length -lt 512000) {
If ($DoSpamC) {
$SpamC = & cmd /c "`"$SADir\spamc.exe`" -d `"$SAHost`" -p `"$SAPort`" -L spam < `"$FileName`""
$SpamCResult = Out-String -InputObject $SpamC
If ($SpamCResult -match "Message successfully un/learned") {
$LearnedSpamMessages++
$LearnedSpamMessagesFolder++
}
If (($SpamCResult -notmatch "Message successfully un/learned") -and ($SpamCResult -notmatch "Message was already un/learned")) {
Throw $SpamCResult
}
}
$SpamFedMessages++
$TotalSpamFedMessages++
}
}
Catch {
$SpamFedMessageErrors++
$Err = $Error[0]
Debug "[ERROR] Feeding SPAM message $FileName in $($hMSAccount.Address)"
Debug "[ERROR] $Err"
}
}
If ($HamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedHamMessagesFolder of $HamFedMessages HAM message$(Plural $HamFedMessages) fed from $($Folder.Name) in $($hMSAccount.Address)"
}
If ($SpamFedMessages -gt 0) {
Debug "Learned tokens from $LearnedSpamMessagesFolder of $SpamFedMessages SPAM message$(Plural $SpamFedMessages) fed from $($Folder.Name) in $($hMSAccount.Address)"
}
$ArraySpamToFeed.Clear()
}
Function FeedBayes {
$Error.Clear()
$BeginFeedingBayes = Get-Date
Debug "----------------------------"
Debug "Begin learning Bayes tokens from messages newer than $BayesDays days"
If (-not($DoSpamC)) {
Debug "SpamC disabled - Test Run ONLY"
}
<# Authenticate hMailServer COM #>
$hMS = New-Object -COMObject hMailServer.Application
$hMS.Authenticate("Administrator", $hMSAdminPass) | Out-Null
$SAHost = $hMS.Settings.AntiSpam.SpamAssassinHost
$SAPort = $hMS.Settings.AntiSpam.SpamAssassinPort
If ($hMS.Domains.Count -gt 0) {
$IterateDomains = 0
Do {
$hMSDomain = $hMS.Domains.Item($IterateDomains)
If (($hMSDomain.Active) -and ($hMSDomain.Name -notmatch $SkipDomainBayes) -and ($hMSDomain.Accounts.Count -gt 0)) {
$IterateAccounts = 0
Do {
$hMSAccount = $hMSDomain.Accounts.Item($IterateAccounts)
If (($hMSAccount.Active) -and ($hMSAccount.Address -notmatch $SkipAccountBayes) -and ($hMSAccount.IMAPFolders.Count -gt 0)) {
$IterateIMAPFolders = 0
Do {
$hMSIMAPFolder = $hMSAccount.IMAPFolders.Item($IterateIMAPFolders)
If (($hMSIMAPFolder.Name -match $HamFolders) -or ($hMSIMAPFolder.Name -match $SpamFolders)) {
If ($hMSIMAPFolder.SubFolders.Count -gt 0) {
GetBayesSubFolders $hMSIMAPFolder
}
GetBayesMessages $hMSIMAPFolder
}
Else {
GetBayesMatchFolders $hMSIMAPFolder
}
$IterateIMAPFolders++
} Until ($IterateIMAPFolders -eq $hMSAccount.IMAPFolders.Count)
}
$IterateAccounts++
} Until ($IterateAccounts -eq $hMSDomain.Accounts.Count)
}
$IterateDomains++
} Until ($IterateDomains -eq $hMS.Domains.Count)
}
Debug "----------------------------"
Debug "Finished feeding $($TotalHamFedMessages + $TotalSpamFedMessages) messages to Bayes in $(ElapsedTime $BeginFeedingBayes)"
Debug "----------------------------"
If ($HamFedMessageErrors -gt 0) {
Debug "Errors feeding HAM to SpamC : $HamFedMessageErrors Error$(Plural $HamFedMessageErrors) present"
Email "[ERROR] HAM SpamC : $HamFedMessageErrors Errors present : Check debug log"
}
If ($TotalHamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedHamMessages of $TotalHamFedMessages HAM message$(Plural $TotalHamFedMessages) found"
Email "[OK] Bayes HAM learn from $LearnedHamMessages of $TotalHamFedMessages message$(Plural $TotalHamFedMessages)"
} Else {
Debug "No HAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No HAM messages newer than $BayesDays days to feed to Bayes"
}
If ($SpamFedMessageErrors -gt 0) {
Debug "Errors feeding SPAM to SpamC : $SpamFedMessageErrors Error$(Plural $SpamFedMessageErrors) present"
Email "[ERROR] SPAM SpamC : $SpamFedMessageErrors Errors present : Check debug log"
}
If ($TotalSpamFedMessages -gt 0) {
Debug "Bayes learned from $LearnedSpamMessages of $TotalSpamFedMessages SPAM message$(Plural $TotalSpamFedMessages) found"
Email "[OK] Bayes SPAM learn from $LearnedSpamMessages of $TotalSpamFedMessages message$(Plural $TotalSpamFedMessages)"
} Else {
Debug "No SPAM messages newer than $BayesDays days to feed to Bayes"
Email "[OK] No SPAM messages newer than $BayesDays days to feed to Bayes"
}
Debug "----------------------------"
Try {
$BayesSync = & cmd /c "`"$SADir\sa-learn.exe`" --sync"
$BayesSyncResult = Out-String -InputObject $BayesSync
If ([string]::IsNullOrEmpty($BayesSyncResult)) {
Throw "Nothing to sync"
}
Debug $BayesSyncResult
}
Catch {
Debug "[ERROR] Bayes Journal Sync: $($Error[0])"
}
Debug "----------------------------"
Try {
If (-not(Test-Path $BayesBackupLocation)) {
Throw "Bayes backup file does not exist - Check Path"
} Else {
& cmd /c "`"$SADir\sa-learn.exe`" --backup > `"$BayesBackupLocation`""
If ((Get-Item -Path $BayesBackupLocation).LastWriteTime -lt ((Get-Date).AddSeconds(-30))) {
Throw "Unknown Error backing up Bayes database"
}
Debug "Successfully backed up Bayes database"
}
}
Catch {
Debug "[ERROR] backing up Bayes : $($Error[0])"
Email "[ERROR] backing up Bayes db"
}
}
FeedBayes
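The warning in the user variables about `""` matching everything follows from how `-notmatch` treats an empty pattern. A small demonstration, where the address and the `example\.com` skip pattern are made up for illustration and `Test-AccountIncluded` is a hypothetical helper that mirrors the script's `-notmatch` test:

```powershell
# Hypothetical helper mirroring the script's test: True means the account is processed.
Function Test-AccountIncluded ([string]$Address, [string]$SkipPattern) {
    Return ($Address -notmatch $SkipPattern)
}

$Address = "user@example.com"                   # made-up address
Test-AccountIncluded $Address ""                # empty regex matches any string, so the account is skipped
Test-AccountIncluded $Address '^$'              # matches only an empty string, so the account is processed
Test-AccountIncluded $Address "example\.com"    # made-up skip pattern that matches, so the account is skipped
```

A pattern such as `'^$'` is one way to say "skip nothing" explicitly, since it cannot match a non-empty address or domain name.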
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
http://spamassassin.1065346.n5.nabble.c ... 55419.html
David C. McCall wrote:
> DOH! I didn't include --sync in periodic sa-learn runs....
>
> slaps his forehead and returns into cave.
>
You shouldn't need --sync, unless you want to force the journal to be
synced and deleted when you run sa-learn. In general it will decide if
it needs to be synced on its own. It's also redundant if you have
--force-expire, as SA always syncs the journal prior to doing expiry.
In general, the journal disappearing is normal. It's just a "holding
tank" for atime updates (and tokens if you have learn to journal
enabled). It periodically gets dumped into the main database and deleted
during the expiry checks.
So, don't be concerned about it disappearing, that just means it's been
synced and hasn't been recreated by mail scanning.
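In script terms, the advice quoted above means a missing journal is not an error condition. A minimal sketch, assuming the journal is a file named `bayes_journal` inside the Bayes database directory; `$BayesDir`, its value, and `Test-BayesJournal` are hypothetical and need adjusting to the local setup:

```powershell
# Hypothetical helper: report whether a journal file is currently present.
Function Test-BayesJournal ([string]$BayesDir) {
    $JournalPath = [System.IO.Path]::Combine($BayesDir, "bayes_journal")
    Return [System.IO.File]::Exists($JournalPath)
}

$BayesDir = "X:\sa-learn\bayes"   # assumption: directory holding the Bayes database files
If (Test-BayesJournal $BayesDir) {
    Write-Output "Journal present - it will be synced automatically, or on demand with sa-learn --sync"
} Else {
    Write-Output "No journal - already synced and not yet recreated by mail scanning"
}
```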
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
SorenR wrote: ↑2021-01-03 18:53 http://spamassassin.1065346.n5.nabble.c ... 55419.html
OK, so if it's optional, I suppose that also means that syncing can't hurt anything - at a minimum - or could be beneficial if done more often than whenever the automatic syncing occurs. "Don't be concerned" sounds a lot different than "DON'T PUSH THAT BIG RED BUTTON!!!!!"
I will leave it in my script for the time being, seeing as how I took the time to write it.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
... ahem... HowTo?
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
https://hmailserver.com/forum/viewtopic ... 21&t=33566
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
Please note that the LashBack list is dead and has been for quite some time.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
OK... so the script in that post is the one that triggers SA learning?
What do I have to do with it? That's not VBA, is it?
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
It's the one you quoted. The script is in PowerShell.
Please note the following:
Code: Select all
.NOTES
Add "--allow-tell" argument to your SPAMD service to allow SPAMC to report SPAM/HAM
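For a spamd service whose parameters are kept as a single string, adding the flag can be done idempotently. A minimal sketch; `Add-AllowTell` is a hypothetical helper, and the NSSM lines at the end are commented out because the service name `SpamAssassin` is an assumption and the existing parameters should be checked by hand before anything is overwritten:

```powershell
# Hypothetical helper: append --allow-tell to a parameter string unless it is already there.
Function Add-AllowTell ([string]$Parameters) {
    If ($Parameters -match "--allow-tell") { Return $Parameters }
    Return ("$Parameters --allow-tell").Trim()
}

Add-AllowTell ""                         # no existing parameters
Add-AllowTell "--max-children 5"         # example existing parameters, made up for illustration
Add-AllowTell "--allow-tell"             # already present, returned unchanged

# Assumed NSSM usage, not run here; "SpamAssassin" is a placeholder service name:
# $Current = & nssm.exe get SpamAssassin AppParameters
# & nssm.exe set SpamAssassin AppParameters (Add-AllowTell $Current)
```

The service has to be restarted afterwards for spamd to pick up the new argument.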
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
OK, I use NSSM as well. I will give it a try: kill the current SA service and set it up again with --allow-tell. Hope that PS doesn't kick my ass again...
Re: SpamAssassin Bootcamp (sa-learn) train BAYES
PowerShell sucks since it can't do DCOM. My hMailServer is a server and as such it just sits there doing its job. When I want to interact with it I use the Admin GUI, file share or DCOM. On rare occasions I have to use Remote Desktop... That goes for BOTH my hMailServers.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.