Module Regexp::Irc
In: lib/rbot/core/utils/extends.rb
lib/rbot/irc.rb

We start with some IRC-related regular expressions, used to match Irc::User nicks and users and Irc::Channel names.

For each of them we define two versions of the regular expression:

  • a generic one, which should match for any server but may turn out to match more than a specific server would accept
  • an RFC-compliant matcher
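
The difference between the two versions can be seen with a minimal sketch. It copies the nick and channel constants from the Constants list into a hypothetical module (IrcRegexpSketch, not part of rbot) so that it runs without rbot loaded. The helper full_match? is likewise an assumption made for illustration; it anchors the pattern so that the whole string has to match.

```ruby
# Hypothetical stand-alone module; the constant bodies are copied from the
# Constants list of Regexp::Irc, the helper method is illustrative only.
module IrcRegexpSketch
  SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/
  NICK_FIRST   = /#{SPECIAL_CHAR}|[[:alpha:]]/
  NICK_ANY     = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
  GEN_NICK     = /#{NICK_FIRST}#{NICK_ANY}+/
  RFC_NICK     = /#{NICK_FIRST}#{NICK_ANY}{0,8}/

  CHAN_FIRST = /[#&+]/
  CHAN_SAFE  = /![A-Z0-9]{5}/
  CHAN_ANY   = /[^\x00\x07\x0A\x0D ,:]/
  GEN_CHAN   = /(?:#{CHAN_FIRST}|#{CHAN_SAFE})#{CHAN_ANY}+/
  RFC_CHAN   = /#{CHAN_FIRST}#{CHAN_ANY}{1,49}|#{CHAN_SAFE}#{CHAN_ANY}{1,44}/

  # True when +str+ matches +re+ from start to end. Interpolating a Regexp
  # wraps it in a group, so top-level alternations stay intact.
  def self.full_match?(re, str)
    /\A#{re}\z/.match?(str)
  end
end

long_nick = "averyverylongnickname"   # 21 characters
long_chan = "#" + "a" * 60            # 61 characters

p IrcRegexpSketch.full_match?(IrcRegexpSketch::GEN_NICK, long_nick)  # => true
p IrcRegexpSketch.full_match?(IrcRegexpSketch::RFC_NICK, long_nick)  # => false
p IrcRegexpSketch.full_match?(IrcRegexpSketch::GEN_CHAN, long_chan)  # => true
p IrcRegexpSketch.full_match?(IrcRegexpSketch::RFC_CHAN, long_chan)  # => false
```

The generic matchers put no upper bound on length, while the RFC versions stop at 9 characters for nicks and 50 for channel names, which is why only the generic ones accept the long inputs.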

Constants

CHAN_LIST = Regexp.new_list(GEN_CHAN)   Match a list of channel names separated by optional commas, whitespace and optionally the word "and"
IN_CHAN = /#{IN_ON}\s+(#{GEN_CHAN})|(here)|/   Match "in channel" or "on channel" and/or "in private" (optionally shortened to "in pvt"), returning the channel name or the word ‘private’ or ‘pvt’ as capture
IN_CHAN_PVT = /#{IN_CHAN}|in\s+(private|pvt)/
IN_CHAN_LIST_SFX = Regexp.new_list(/#{GEN_CHAN}|here/, IN_ON)   As above, but with channel lists
IN_CHAN_LIST = /#{IN_ON}\s+#{IN_CHAN_LIST_SFX}|anywhere|everywhere/
IN_CHAN_LIST_PVT_SFX = Regexp.new_list(/#{GEN_CHAN}|here|private|pvt/, IN_ON)
IN_CHAN_LIST_PVT = /#{IN_ON}\s+#{IN_CHAN_LIST_PVT_SFX}|anywhere|everywhere/
NICK_LIST = Regexp.new_list(GEN_NICK)   Match a list of nicknames separated by optional commas, whitespace and optionally the word "and"
CHAN_FIRST = /[#&+]/   Channel-name-matching regexps
CHAN_SAFE = /![A-Z0-9]{5}/
CHAN_ANY = /[^\x00\x07\x0A\x0D ,:]/
GEN_CHAN = /(?:#{CHAN_FIRST}|#{CHAN_SAFE})#{CHAN_ANY}+/
RFC_CHAN = /#{CHAN_FIRST}#{CHAN_ANY}{1,49}|#{CHAN_SAFE}#{CHAN_ANY}{1,44}/
SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/   Nick-matching regexps
NICK_FIRST = /#{SPECIAL_CHAR}|[[:alpha:]]/
NICK_ANY = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
GEN_NICK = /#{NICK_FIRST}#{NICK_ANY}+/
RFC_NICK = /#{NICK_FIRST}#{NICK_ANY}{0,8}/
USER_CHAR = /[^\x00\x0a\x0d @]/
GEN_USER = /#{USER_CHAR}+/
HOSTNAME_COMPONENT = /[[:alnum:]](?:[[:alnum:]]|-)*[[:alnum:]]*/   Host-matching regexps
HOSTNAME = /#{HOSTNAME_COMPONENT}(?:\.#{HOSTNAME_COMPONENT})*/
HOSTADDR = /#{IP_ADDR}|#{IP6_ADDR}/
GEN_HOST = /#{HOSTNAME}|#{HOSTADDR}/
GEN_HOST_EXT = /\S+/   Sadly, different networks have different, RFC-breaking ways of cloaking the actual host address: see above for an example to handle FreeNode. Another example would be Azzurra, which also inserts a "=" in the cloaked host. So let's just not care about this and go with the simplest thing:
GEN_USER_ID = /(#{GEN_NICK})(?:(?:!(#{GEN_USER}))?@(#{GEN_HOST_EXT}))?/   User-matching Regexp
BANG_AT = /#{GEN_NICK}|\S+?(?:!\S+?)?@\S+?/   Things such as the BIP proxy send invalid nicks in a complete netmask, so we want to match those as well: this matches either a compliant nick or a string with a very generic nick, a very generic hostname after an @ sign, and an optional user after a !
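
As an illustration of how GEN_USER_ID and BANG_AT behave, here is a minimal sketch that copies the relevant constants into a hypothetical module (IrcUserIdSketch, not part of rbot). The methods split_user_id and bang_at? are assumptions made for this example only. They anchor the patterns at both ends, which the constants themselves do not do.

```ruby
# Hypothetical stand-alone module; the constant bodies are copied from the
# list above, the two helper methods are illustrative only.
module IrcUserIdSketch
  SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/
  NICK_FIRST   = /#{SPECIAL_CHAR}|[[:alpha:]]/
  NICK_ANY     = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
  GEN_NICK     = /#{NICK_FIRST}#{NICK_ANY}+/
  USER_CHAR    = /[^\x00\x0a\x0d @]/
  GEN_USER     = /#{USER_CHAR}+/
  GEN_HOST_EXT = /\S+/
  GEN_USER_ID  = /(#{GEN_NICK})(?:(?:!(#{GEN_USER}))?@(#{GEN_HOST_EXT}))?/
  BANG_AT      = /#{GEN_NICK}|\S+?(?:!\S+?)?@\S+?/

  # Split "nick!user@host" into its parts using the three captures of
  # GEN_USER_ID. Returns nil when the whole string does not match.
  def self.split_user_id(str)
    m = /\A#{GEN_USER_ID}\z/.match(str)
    m && { nick: m[1], user: m[2], host: m[3] }
  end

  # True when the whole string matches the looser BANG_AT pattern.
  def self.bang_at?(str)
    /\A#{BANG_AT}\z/.match?(str)
  end
end

# Full netmask: nick "rbot", user "~bot", host "host.example.org"
full = IrcUserIdSketch.split_user_id("rbot!~bot@host.example.org")

# Cloak-style host with no user part: user is nil, host "unaffiliated/rbot"
cloaked = IrcUserIdSketch.split_user_id("rbot@unaffiliated/rbot")

# A nick starting with a digit is rejected by GEN_USER_ID (nil here) ...
strict = IrcUserIdSketch.split_user_id("1bad!user@host")

# ... but the same netmask is accepted by BANG_AT
loose = IrcUserIdSketch.bang_at?("1bad!user@host")   # true
```

The host part is captured with GEN_HOST_EXT rather than GEN_HOST, so cloaked hosts containing characters such as "/" or "=" still end up in the third capture.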
