<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>bentley.foo</title>
<link>https://bentley.foo</link>
<description>Blog posts from bentley.foo</description>
<language>en-us</language>
<atom:link href="https://bentley.foo/feed.xml" rel="self" type="application/rss+xml" />
<item>
<title>Books I've read in 2026</title>
<link>https://bentley.foo/books-ive-read-in-2026.html</link>
<guid>https://bentley.foo/books-ive-read-in-2026.html</guid>
<pubDate>Tue, 24 Feb 2026 00:00:00 -0000</pubDate>
<description>&lt;![CDATA[<ul>
<li><em>Shroud</em> — Adrian Tchaikovsky</li>
<li><em>Klara and the Sun</em> — Kazuo Ishiguro</li>
<li><em>Sun of Blood and Ruin</em> — Mariely Lares</li>
<li><em>The Wild Fox of Yemen</em> — Threa Almontaser</li>
<li><em>The Lathe of Heaven</em> — Ursula K. Le Guin</li>
<li><em>1491</em> — Charles C. Mann (in progress)</li>
</ul>]]&gt;</description>
</item>
<item>
<title>Hiring and growth for security research and response teams</title>
<link>https://bentley.foo/hiring-and-growth-for-security-research-and-response-teams.html</link>
<guid>https://bentley.foo/hiring-and-growth-for-security-research-and-response-teams.html</guid>
<pubDate>Tue, 30 Aug 2022 00:00:00 -0000</pubDate>
<description>&lt;![CDATA[<h1>Hiring</h1>
<p><img src="/images/20220830/marmot_soc_1.png" alt="marmot in a soc" style="max-width: 300px; width: 100%; height: auto;"></p>
<h2>Hire for Curiosity</h2>
<p>Expand your candidate list from experienced researchers to experienced engineers who have strong curiosity. Software engineers often have a background that the most experienced researchers do not: how applications are deployed at scale and how systems communicate.</p>
<p>These engineers know that critical credentials are stored in Terraform state files. They also know the nuances, such as instances in your private subnet being able to communicate externally with a <a href="https://en.wikipedia.org/wiki/Botnet#Command_and_control">c2</a> over <a href="https://docs.aws.amazon.com/vpc/latest/userguide/egress-only-internet-gateway.html">IPv6</a> without a NAT gateway. Exposing software engineers to experienced researchers enables team discoveries that cover both allure and day-in-the-life practicality. Joining forces is the way.</p>
<h2>Hire for Locale</h2>
<p>Some threats are obvious and blatant, and in those instances we can classify and move on. But you cannot apply a blanket of cultural norms to a customer base that crosses cultural boundaries. Data collection and monetization practices vary across cultural demarcation points. Your best bet at understanding these impacts and backgrounds is to hire from the locale. Specifically, if you are a US company analyzing applications or threat actors from LATAM and Asia, you need to have researchers from those regions. Colloquialisms and slang simply do not translate via automated tools. Further, what is considered acceptable for data collection and monetization can vary wildly by region. It is imperative to understand the intent and the outcome before you classify. Otherwise, you risk false positives in your Serviceable Obtainable Market (SOM).</p>
<p><img src="/images/20220830/team_of_marmots.png" alt="team of marmots" style="max-width: 300px; width: 100%; height: auto;"></p>
<h1>Outputs</h1>
<p>Researchers need and deserve a technical outlet. You can provide one with:</p>
<ul>
<li>public facing technical blog</li>
<li>internal technical blog</li>
<li>regular internal documentation on threat actors and malware</li>
</ul>
<p>Product, research leads, and product marketing (PMM) can create a pipeline of technical and product relevant posts based on your internal documentation. Credit your researchers in public posts. Your sales engineering (SE) team will leverage internal technical blogs on their own. Promote SE self-service and prevent premature disclosure by clearly marking internal blogs with <a href="https://en.wikipedia.org/wiki/Traffic_Light_Protocol">traffic light protocol</a> (TLP). Incorporate TLP and the blogs into the SE onboarding process.</p>
<p>Creating a PMM gateway for all outputs is the worst outcome. I've brought this up at several organizations and heard several reasons why companies don't include their researchers as contributors or allow direct outputs. The worst reason to date was that they didn't want their researchers poached. The second worst was thinking that researchers couldn't do the writing.</p>
<p>Researchers writing will help you win over the customer personas and roles needed to land a new customer.</p>
<h2>Personas</h2>
<p>Economic Buyer: The person holding the purse strings. This person is the Wolf Blitzer problem. If Wolf Blitzer is talking about it, you need a market-ready answer. PMM is a good resource for this. Every other persona and role is whispering in the ear of the economic buyer.</p>
<p>Technical expert / detractor: They are looking for technical outputs and solutions that solve day in the life issues for them. They may already have a solution in mind, which may not be you. This requires strong research led outputs and practical product solutions. PMM is unlikely to solve this. Research content that shows expertise and references product capabilities that solve day in the life issues is the key.</p>
<p>Champion: This is often a relationship managed directly by sales or sales engineering. Your research team is your hidden weapon. Feeding technical details via your internal blog to the SE team powers competitive knowledge as well as deeply technical conversations.</p>
<h1>Leveling</h1>
<p>Diversity of experience presents a leveling option that allows for different verticals of expertise. When you create your leveling matrix, you can have the same levels as engineering: Senior Engineer, Staff, Principal, etc. Within those levels, you have to account for the verticals of expertise. Comparing your kernel hacker directly to your Kubernetes expert may not yield results that retain people, allow them to be fulfilled in their growth, or help you plot a path with them to what they want to achieve.</p>
<p>Consider using the <a href="https://en.wikipedia.org/wiki/Four_stages_of_competence">Four Stages of Competence</a> throughout your leveling. This helps measure the level of competence each person is achieving within their vertical of expertise, rather than relying on direct comparisons.</p>
<h1>Chart a course</h1>
<p>Iteration based retrospectives are critical for security research teams to grow. You have to cover</p>
<ul>
<li>What went well</li>
<li>What could have gone better</li>
<li>What you want to try differently</li>
</ul>
<p>Have your team contribute their feedback ahead of the retrospective. This should be in written form via a shared medium like Confluence. Have your team rotate through the timekeeper and moderator roles in the retrospective. Set timelines for what they want to try differently, e.g., six weeks, and let them grow. Your role is to ensure they focus on processes and ideas and not people. A strong program manager can play a pivotal role in the success of this process.</p>
<h1>Summary</h1>
<p><img src="/images/20220830/marmot_dc.png" alt="marmot in a dc" style="max-width: 300px; width: 100%; height: auto;"></p>
<p>Experienced researchers are critical for your team's success. Hiring for diversity of experience and locale will improve your odds of success. Expanding your team and SOM requires software engineers and regional knowledge. Give your team the means to communicate, with supporting guardrails in the background. Enable growth and paths for success by measuring competence within verticals of expertise.</p>
<p>There is no blueprint for every team. Go forth, adapt, build, and grow.</p>
<h1>Appendix</h1>
<p>Many of the items in this post were sourced from or influenced by research team retrospectives.</p>
<p>Images created using midjourney.com</p>
<hr />
<p>(c) Michael Bentley 2022</p>
<p>Contents may not be republished without written consent.</p>]]&gt;</description>
</item>
<item>
<title>Threat intel databases, part two</title>
<link>https://bentley.foo/threat-intel-databases-part-two.html</link>
<guid>https://bentley.foo/threat-intel-databases-part-two.html</guid>
<pubDate>Thu, 25 Aug 2022 00:00:00 -0000</pubDate>
<description>&lt;![CDATA[<p>This post continues from "<a href="https://marmot.studio/posts/20220822/">Threat intel databases, part one</a>". For simplicity, mentions of threat intel can be considered to include geolocation data.</p>
<h1>Threat Intel Acquisition</h1>
<p><img src="/images/20220825/marmot_lab_1.png" alt="" style="max-width: 300px; width: 100%; height: auto;"></p>
<p><strong>Day 0</strong></p>
<p>Flat files versus the world. Day 0, your focus should be flat files. Streaming and API-based feeds can wait. Flat files provide the most lift for the effort applied. This assumes that well-known sources such as abuse.ch, PAAS mappings<sup>1</sup>, and customer submitted threat intel / trusted entities are important. With flat files, all of your customers will be able to contribute a CSV of trusted IPs or malicious entities.</p>
<p><strong>Day 0+</strong></p>
<p>You are acquiring abuse.ch, AWS ip-json, and all the other feeds that are table stakes. Your research team submissions are integrated with flat files. Now you are ready to move into commercial feeds. Once you transition to commercial feeds, you will find that most vendors want you to use an API-based model where you pay per entity (IP/Domain). Common reasons vendors push the API model:</p>
<ul>
<li>They cannot derive any product insights from a flat file model</li>
<li>Hard entitlement enforcement. Caching is your only way around a 1:1 ratio of observations to queries.</li>
<li>They do not have a production ready bulk file option</li>
</ul>
<p>For your vendor, the API-based model drives account management, product improvement and increasing spend.</p>
<p>There is a window where an API-based model is acceptable. Past this window, it no longer scales as a technology or a cost.</p>
<h2>API best practices</h2>
<p>Caching API responses can significantly improve application performance. Generally, domains and file hashes can be cached for long durations. IP addresses should not be cached for more than a few hours, or you risk false positives. Practically, your max cache duration should be aligned with your acquisition windows<sup>2</sup>. For example, if you acquire every 12 hours, don't cache for 24. If you cache longer than your acquisition window, troubleshooting false positives will devolve into a negative QA experience executed by your research and response teams.</p>
<p>Using Redis keys with TTLs is a practical solution: refresh the TTLs when you process new threat intel, and old data will automatically expire. Whatever caching implementation you choose, your research and response team will need access to it. Specifically, they will need to validate detections as well as purge false positives from the cache without engineering involvement in the incident response.</p>
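<p>A minimal sketch of the TTL idea, using an in-memory dict as a stand-in for Redis so it runs anywhere. The class name <code>TTLCache</code>, the constant <code>ACQUISITION_WINDOW_SECONDS</code>, and the 12-hour window are assumptions for illustration, not Marmot's implementation.</p>
<pre class="codehilite"><code class="language-python">import time

# Assumption for illustration: threat intel is acquired every 12 hours,
# so nothing is cached for longer than that.
ACQUISITION_WINDOW_SECONDS = 12 * 60 * 60


class TTLCache:
    # In-memory stand-in for Redis keys with TTLs.

    def __init__(self, ttl_seconds=ACQUISITION_WINDOW_SECONDS, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._store = {}  # entity -&gt; (verdict, expires_at)

    def set(self, entity, verdict):
        # Writing an entity again refreshes its TTL, as processing new intel would.
        self._store[entity] = (verdict, self.clock() + self.ttl_seconds)

    def get(self, entity):
        item = self._store.get(entity)
        if item is None:
            return None
        verdict, expires_at = item
        if self.clock() &gt;= expires_at:
            # Older than one acquisition window: treat as expired.
            del self._store[entity]
            return None
        return verdict

    def purge(self, entity):
        # Lets research and response remove a false positive directly.
        return self._store.pop(entity, None) is not None
</code></pre>

<p>The <code>purge</code> method is the part that matters for incident response: whatever cache you use, an equivalent should be reachable without an engineering deploy.</p>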
<p>Do not cache your cache. No happiness comes from Dante's 9 circles of threat intel caching.</p>
<p>Common ways cyber-security companies discover they are over their commercial threat intel API limits:</p>
<ul>
<li>Fail closed: the API integration is blocking for your application, you hit your API limit and trigger an outage.</li>
<li>Fail open: the API integration is not blocking, you miss a table stakes detection and your customer escalates. This is a toxic false negative<sup>3</sup>.</li>
</ul>
<p>Monitor your API call counts versus quota counts.</p>
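<p>One way to act on this, sketched under assumptions: the function name <code>quota_status</code> and the 80% warning threshold are hypothetical choices, not part of any vendor API. The idea is to alert before the limit is hit rather than discover it through an outage or an escalation.</p>
<pre class="codehilite"><code class="language-python">def quota_status(calls_made, quota, warn_ratio=0.8):
    # Classify API usage so alerts can fire before the limit is reached.
    # warn_ratio is an assumed threshold; tune it to your traffic patterns.
    if quota &lt;= 0:
        raise ValueError('quota must be positive')
    used = calls_made / quota
    if used &gt;= 1.0:
        return 'exceeded'
    if used &gt;= warn_ratio:
        return 'warn'
    return 'ok'


print(quota_status(8500, 10000))  # warn
</code></pre>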
<p><img src="/images/20220825/marmot_lab_2.png" alt="" style="max-width: 300px; width: 100%; height: auto;"></p>
<h2>Post API</h2>
<p>Bulk flat files are your friend. Flat files still need caching and research and response enablement. Major downsides to bulk flat files are:</p>
<ul>
<li>Many vendors have no idea how to price it. This will be apparent in your discovery calls with their sales team.</li>
<li>Cost: you are unlikely to encounter a vendor where spend will go down when transitioning from API to bulk.</li>
<li>Data quality: bulk flat files expose the data quality issues that are often less apparent with APIs.</li>
</ul>
<h1>Additional considerations</h1>
<p>All threat intel sources will have false positives: commercial, free, open-source, and your research team. False positives range from the obvious and outright, e.g., <code>1.2.3.4</code> is labeled malware when it is not, to those derived from cultural differences or opinions. Obvious false positives are easy to solve through improved processes and threat acquisition filtering capabilities. False positives that are derived from cultural differences or opinions will need to be handled via product enhancements accessible to your customer.</p>
<p>There is no standard format for threat intel, and there are extensive quality control issues. You will often see the same source use <code>null</code>, <code>"None"</code>, <code>"N/A"</code> values interchangeably. Timestamps can have wild variations.</p>
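<p>A small normalization sketch for this kind of cleanup. The set of null-like spellings, the timestamp formats, the treatment of all-digit values as epoch seconds, and the choice to read naive timestamps as UTC are all assumptions for illustration; extend them to match what your sources actually send.</p>
<pre class="codehilite"><code class="language-python">from datetime import datetime, timezone

# Assumption: these spellings all mean that no value was supplied.
NULL_LIKE = {'', 'null', 'none', 'n/a'}

# Assumption: formats seen in feeds; naive values are treated as UTC.
TIMESTAMP_FORMATS = ('%Y-%m-%d %H:%M:%S', '%Y-%m-%dT%H:%M:%SZ', '%Y-%m-%d')


def normalize_value(value):
    # Collapse the interchangeable null spellings into a real None.
    if value is None:
        return None
    text = str(value).strip()
    if text.lower() in NULL_LIKE:
        return None
    return text


def normalize_timestamp(value):
    # Return a timezone-aware UTC datetime, or None for null-like input.
    text = normalize_value(value)
    if text is None:
        return None
    if text.isdigit():
        # Assumption: all-digit values are epoch seconds.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    # Reject what is not expected instead of guessing.
    raise ValueError(f'unrecognized timestamp: {text!r}')
</code></pre>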
<p>File hashes can have prolific growth. For example, we once discovered a malicious Android app. For weeks after the initial discovery, we were acquiring 700k+ new hashes for the same malware every day. Prolific growth still fits in your Redis cache.</p>
<p>File size (bulk) can vary wildly from a few megabytes to hundreds of megabytes per feed per download.</p>
<h1>Threat intel and geolocation data persistence</h1>
<h2>Amazon Athena / S3</h2>
<p>Marmot acquires threat intel and persists it to S3. After acquisition and validation, data is written to two locations:</p>
<ul>
<li><code>latest</code></li>
<li><code>archive</code></li>
</ul>
<pre class="codehilite"><code>latest:  s3://acme-bucket/threat_intel/external/latest/abuse_ch/latest.jsonl.gz
archive: s3://acme-bucket/threat_intel/external/archive/abuse_ch/year=2022/month=08/day=24/&lt;relevant_filename&gt;.jsonl.gz
</code></pre>
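<p>The two paths above can be generated from a source name and an acquisition date. A small sketch, assuming the generic bucket and prefix from the example; the function names are hypothetical and <code>feed.jsonl.gz</code> is a placeholder filename.</p>
<pre class="codehilite"><code class="language-python">from datetime import date

# Assumption: bucket and prefix mirror the generic example paths above.
BASE = 's3://acme-bucket/threat_intel/external'


def latest_uri(source):
    # One fixed key per source, overwritten on each acquisition.
    return f'{BASE}/latest/{source}/latest.jsonl.gz'


def archive_uri(source, acquired, filename):
    # Hive-style partitions with zero-padded month and day, as in the example path.
    return (
        f'{BASE}/archive/{source}/'
        f'year={acquired:%Y}/month={acquired:%m}/day={acquired:%d}/{filename}'
    )


print(latest_uri('abuse_ch'))
print(archive_uri('abuse_ch', date(2022, 8, 24), 'feed.jsonl.gz'))
</code></pre>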

<p>Latest and archive are available for searching via Athena. </p>
<p><strong>Latest</strong></p>
<p>Latest is your primary query source for any recent observations that occurred within your last acquisition window. This should cover the majority of your product's Athena-based queries.</p>
<p><strong>Archive</strong></p>
<p>Archive threat intel provides value for</p>
<ul>
<li>Evidence: A customer has a question on an event that is days to weeks old. The threat intel artifact persisted with the event does not contain enough evidence.</li>
<li>Research: Your team is working on a threat report or investigating events from a relevant date in your archive</li>
<li>Analytics: Statistical analysis of threat metadata to create new filters or derivative threat intel</li>
</ul>
<p>The S3 keys <code>year</code>, <code>month</code>, and <code>day</code> are Hive style <a href="https://docs.aws.amazon.com/athena/latest/ug/partitions.html">partitions</a> which can be queried as columns. For example:</p>
<pre class="codehilite"><code class="language-sql">SELECT * 
FROM abuse_ch 
WHERE &quot;year&quot; = 2022
 AND &quot;month&quot; = 5
 AND &quot;day&quot; = 5 
limit 10;
</code></pre>

<p>Partitions are a powerful feature that improves performance by limiting the amount of data scanned by a query. Many relevant variations are possible, including partitioning by source names, customer UUIDs, etc.</p>
<h2>PostgreSQL</h2>
<p>Create a structure in PostgreSQL similar to the S3 layout, where threat intel from the same source is not co-mingled across acquisition events.</p>
<p>DB considerations</p>
<ul>
<li>Avoid updates and deletes. Threat intel metadata can be highly ephemeral</li>
<li>Build your indexes in one shot </li>
<li>Consider truncating threat intel data before requiring <a href="https://wiki.postgresql.org/wiki/TOAST">TOAST</a></li>
</ul>
<p>These constraints, minus TOAST, push towards a design where a table is created for each acquisition event.</p>
<h3>Persistence process</h3>
<p>Post acquisition to S3, sanitization, and validation<sup>4</sup>:</p>
<ul>
<li>Create connection with <a href="https://www.psycopg.org/docs/usage.html#transactions-control">autocommit</a>=False</li>
<li>Create cursor <a href="https://www.psycopg.org/docs/usage.html#with-statement">with</a> context manager</li>
<li>Create database table with a unique name. ex: <code>abuse_ch_2022_05_05__06_00</code></li>
<li>Insert all rows</li>
<li>Commit</li>
<li>Create cursor with context manager</li>
<li>Add indexes </li>
<li>Commit</li>
<li>Create cursor with context manager </li>
<li>Add new row, referencing the new table, to table inventory table </li>
<li>Commit</li>
<li>Table is now accessible for new queries</li>
<li>Close connection<sup>5</sup></li>
</ul>
<p>How you use your context manager and when you perform commits depends heavily on your application structure. The two most impactful commits are:</p>
<ol>
<li>The commit after inserting all rows. Commits per row will slow down the process.</li>
<li>The commit to the inventory table. This makes the data accessible to the application.</li>
</ol>
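<p>A minimal sketch of the commit ordering above. It uses Python's built-in <code>sqlite3</code> as a stand-in for PostgreSQL and psycopg2 so that it runs without a database server. The column names, the sample rows, and the helper names <code>persist_acquisition</code> and <code>latest_table</code> are assumptions for illustration. With psycopg2 you would build the table name with <code>sql.Identifier</code> rather than the character check used here.</p>
<pre class="codehilite"><code class="language-python">import re
import sqlite3
from datetime import datetime


def persist_acquisition(conn, source_name, acquired_at, rows):
    # Unique table per acquisition event, e.g. abuse_ch_2022_05_05__06_00.
    table_name = f'{source_name}_{acquired_at:%Y_%m_%d__%H_%M}'
    # sqlite3 cannot parameterize identifiers, so allow only safe characters.
    if not re.fullmatch(r'[a-z][a-z0-9_]*', table_name):
        raise ValueError(f'unsafe table name: {table_name!r}')

    # 1. Create the table, insert all rows, then commit once (not per row).
    conn.execute(f'CREATE TABLE {table_name} (ioc_value TEXT, threat_type TEXT)')
    conn.executemany(f'INSERT INTO {table_name} VALUES (?, ?)', rows)
    conn.commit()

    # 2. Build indexes in one shot, after the bulk insert.
    conn.execute(f'CREATE INDEX {table_name}_ioc_idx ON {table_name} (ioc_value)')
    conn.commit()

    # 3. Register the table. This commit makes it visible to the application.
    conn.execute(
        'INSERT INTO tmp_ti_table_inv (source_name, table_name) VALUES (?, ?)',
        (source_name, table_name),
    )
    conn.commit()
    return table_name


def latest_table(conn, source_name):
    # Same idea as the inventory lookup: newest id wins.
    row = conn.execute(
        'SELECT table_name FROM tmp_ti_table_inv '
        'WHERE source_name = ? ORDER BY id DESC LIMIT 1',
        (source_name,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE tmp_ti_table_inv '
    '(id INTEGER PRIMARY KEY AUTOINCREMENT, source_name TEXT, table_name TEXT)'
)
persist_acquisition(conn, 'abuse_ch', datetime(2022, 5, 5, 6, 0),
                    [('198.51.100.7:80', 'botnet_cc')])
persist_acquisition(conn, 'abuse_ch', datetime(2022, 5, 5, 18, 0),
                    [('198.51.100.9:443', 'botnet_cc')])
print(latest_table(conn, 'abuse_ch'))  # abuse_ch_2022_05_05__18_00
</code></pre>

<p>Because the inventory row is written last, a reader that resolves tables through the inventory never sees a table that is half loaded or missing its indexes.</p>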
<p><strong>Table inventory table</strong></p>
<p>The inventory table helps with two tasks: determining the latest table per source and determining which tables can be pruned.</p>
<p><strong>Determining the latest table</strong></p>
<p>Example query to identify all tables related to a threat intel source and return the latest table.</p>
<pre class="codehilite"><code class="language-sql">SELECT
  table_name
FROM tmp_ti_table_inv
WHERE source_name = %(source_name)s
ORDER BY id DESC
LIMIT 1;
</code></pre>

<p><strong>Pruning tables</strong></p>
<p>Consider two tables per threat intel source to be a minimum requirement: the latest table plus the table you failed over from. Additional tables are helpful for quick QA. In the following query, the number of tables to keep per source is passed as <code>lowest_rank</code>. Setting <code>lowest_rank</code> to <code>5</code> would return all tables older than the most recent 5 tables for each threat intel source. The returned tables are the tables you prune.</p>
<pre class="codehilite"><code class="language-sql">SELECT
   id,
   src_table_name
FROM (
    SELECT tmp_ti_table_inv.*,
    rank() OVER (
        PARTITION BY source_name
        ORDER BY id DESC
    )
FROM tmp_ti_table_inv)
rf WHERE rank &gt; %(lowest_rank)s;
</code></pre>
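<p>For reasoning about the ranking logic outside the database, here is a pure-Python sketch of the same selection. The function name <code>tables_to_prune</code> and the row shape <code>(id, source_name, table_name)</code> are assumptions for illustration. It also enforces the two-table minimum described above.</p>
<pre class="codehilite"><code class="language-python">from collections import defaultdict


def tables_to_prune(inventory, keep=5):
    # inventory: iterable of (id, source_name, table_name) rows.
    # Never keep fewer than two tables per source: the latest plus the failover.
    keep = max(keep, 2)
    by_source = defaultdict(list)
    for row_id, source_name, table_name in inventory:
        by_source[source_name].append((row_id, table_name))

    prune = []
    for tables in by_source.values():
        # Highest id is the most recent acquisition; skip the newest ones.
        newest_first = sorted(tables, reverse=True)
        prune.extend(name for _, name in newest_first[keep:])
    return sorted(prune)


rows = [
    (1, 'abuse_ch', 'abuse_ch_2022_05_04__06_00'),
    (2, 'abuse_ch', 'abuse_ch_2022_05_04__18_00'),
    (3, 'abuse_ch', 'abuse_ch_2022_05_05__06_00'),
    (4, 'aws_ip_ranges', 'aws_ip_ranges_2022_05_05__06_00'),
]
print(tables_to_prune(rows, keep=2))  # ['abuse_ch_2022_05_04__06_00']
</code></pre>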

<h2>Simplifying tables</h2>
<p>I use the same table structure for datasets that are similar. AWS and GCP ip-json ranges are a great example of this. The main benefit is query reuse across sources. This does, however, mean you will need to parameterize table names.</p>
<p>Psycopg2 provides <a href="https://www.psycopg.org/docs/sql.html#module-usage">functionality</a> for table name parameterization.</p>
<p><code>src_n</code> and <code>src_tn</code> will be safely added into the query.</p>
<pre class="codehilite"><code class="language-python">from psycopg2 import sql

# sql.Identifier quotes the alias and table name safely.
# %(ip)s is left as a placeholder and bound later, at execute time.
if source_name in {'aws_ip_ranges', 'gcp_ip_ranges'}:
    select_q = sql.SQL(&quot;&quot;&quot;
    SELECT 
      props AS {src_n}
    FROM {src_tn}
    WHERE ip_prefix &gt;&gt; %(ip)s::inet;
    &quot;&quot;&quot;).format(src_n=sql.Identifier(source_name),
                src_tn=sql.Identifier(src_table_name))
</code></pre>

<h1>Summary</h1>
<p>Threat Intel is a fascinating adventure into data acquisition, sanitization, and filtering as well as presentation layers. It presents a broad range of challenges. As a table stakes capability, missing these challenges incurs low efficacy and toxic false positives, and damages <a href="https://en.wikipedia.org/wiki/Net_promoter_score">NPS</a> scores. Meeting the challenges is a moving target that is fun and provides tangible value to your research teams, customers, and sales enablement. This post lightly touches on these challenges, and nothing here is gospel. Adapt this to your needs, be flexible, and most of all, have fun, enable your teams, and detect malicious things.</p>
<hr />
<h1>Appendix</h1>
<p><strong>References</strong></p>
<ol>
<li>a) https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html , b) https://cloud.google.com/compute/docs/faq#find_ip_range</li>
<li>Some threat intel sources offer full threat intel downloads and some offer updates only. The choice of update model impacts persistence, pruning, and performance.</li>
<li>Toxic False Negative: A false negative that can impact your sales pipeline, company brand, and product trust. Often associated with a table stakes detection that cannot be explained away.</li>
<li>Do not trust external data. Always sanitize and validate contents.</li>
<li>Connection generation is costly. Closing connections is highly dependent on application structure. In some scenarios, such as connection proxies, it may not be needed. https://aws.amazon.com/rds/proxy/ </li>
</ol>
<p><strong>Notes</strong></p>
<ul>
<li>PostgreSQL interactions occur with Python3 and psycopg2</li>
<li>What about STIX and TAXII? I consider this a customer oriented feature, often not available for acquisition of external feeds. It should exist on your roadmap.</li>
<li>Images created using midjourney.com</li>
<li>Queries and S3 paths shown have been updated to be more generic</li>
</ul>
<hr />
<p>(c) Michael Bentley 2022</p>
<p>Contents may not be republished without written consent.</p>]]&gt;</description>
</item>
<item>
<title>Threat intel databases, part one</title>
<link>https://bentley.foo/threat-intel-databases-part-one.html</link>
<guid>https://bentley.foo/threat-intel-databases-part-one.html</guid>
<pubDate>Mon, 22 Aug 2022 00:00:00 -0000</pubDate>
<description>&lt;![CDATA[<p><img src="/images/20220822/marmot_in_soc.png" alt="" style="max-width: 300px; width: 100%; height: auto;"></p>
<h1>Intro</h1>
<p>Three types of content I manage are threat intel, geolocation, and honeypot observations.</p>
<p><em>Threat Intel</em> is an opinion on an entity. Often that entity is a file hash, IP address, or domain that is associated with malware.</p>
<p><em>Geolocation</em> is location information associated with an IP address. For example, an IP associated with cloud providers like AWS and Alibaba, <a href="https://en.wikipedia.org/wiki/Autonomous_system_%28Internet%29">ASN</a>s, or countries, states and cities. The main difference between geolocation data and threat intel is how the data is queried.</p>
<p><em>Honeypot observations</em> are data associated with a connection to the Marmot<sup>1</sup> honeypot. This includes data such as source IP, ports, and payloads.</p>
<p><strong>Example datasets</strong></p>
<p><sub>Threat intel example from abuse.ch</sub></p>
<table>
<thead>
<tr>
<th>ioc_value</th>
<th>threat_type</th>
<th>malware_alias</th>
</tr>
</thead>
<tbody>
<tr>
<td><em>xx</em>.161.27.133:80</td>
<td>botnet_cc</td>
<td>Agentemis,BEACON,CobaltStrike,cobeacon</td>
</tr>
<tr>
<td><em>xx</em>.95.30.6:443</td>
<td>botnet_cc</td>
<td>Agentemis,BEACON,CobaltStrike,cobeacon</td>
</tr>
</tbody>
</table>
<p><sub>Geo ASN example from Ip2Location</sub></p>
<table>
<thead>
<tr>
<th>beginning_ip</th>
<th>ending_ip</th>
<th>cidr</th>
<th>asn</th>
<th>asn name</th>
</tr>
</thead>
<tbody>
<tr>
<td>16859136</td>
<td>16871423</td>
<td>1.1.96.0/20</td>
<td>2519</td>
<td>Arteria Networks Corporation</td>
</tr>
<tr>
<td>16871424</td>
<td>16873471</td>
<td>1.1.112.0/21</td>
<td>2519</td>
<td>Arteria Networks Corporation</td>
</tr>
</tbody>
</table>
<p><sub>Geo example from Ip2Location</sub></p>
<table>
<thead>
<tr>
<th>network_int</th>
<th>broadcast_int</th>
<th>iso_country</th>
<th>country</th>
<th>region</th>
</tr>
</thead>
<tbody>
<tr>
<td>16777216</td>
<td>16777471</td>
<td>US</td>
<td>United States of America</td>
<td>California</td>
</tr>
<tr>
<td>16777472</td>
<td>16778239</td>
<td>CN</td>
<td>China</td>
<td>Fujian</td>
</tr>
</tbody>
</table>
<p><sub>PAAS example from AWS</sub></p>
<table>
<thead>
<tr>
<th>ip_prefix</th>
<th>region</th>
<th>service</th>
<th>network_border_group</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.2.34.0/26</td>
<td>af-south-1</td>
<td>AMAZON</td>
<td>af-south-1</td>
</tr>
<tr>
<td>3.5.140.0/22</td>
<td>ap-northeast-2</td>
<td>AMAZON</td>
<td>ap-northeast-2</td>
</tr>
</tbody>
</table>
<p>The purpose of acquiring threat intel, geolocation, and honeypot data is to ask questions of it. These questions can be for threat hunting or policy-based reasons. Asking questions of the datasets means matching on the following:</p>
<ul>
<li>IP address equals IP address</li>
<li>IP address in CIDR block</li>
<li>IP integer<sup>2</sup> between network IP integer and broadcast IP integer</li>
</ul>
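<p>The three kinds of match can be tried out with Python's standard <code>ipaddress</code> module. This is an illustration only; the <code>1.2.3.0/24</code> network is an assumed example, not a row from any dataset above.</p>
<pre class="codehilite"><code class="language-python">from ipaddress import IPv4Address, IPv4Network

ip = IPv4Address('1.2.3.4')
network = IPv4Network('1.2.3.0/24')  # assumed example network

# 1. IP address equals IP address
print(ip == IPv4Address('1.2.3.4'))  # True

# 2. IP address in CIDR block
print(ip in network)  # True

# 3. IP integer between network IP integer and broadcast IP integer
low = int(network.network_address)
high = int(network.broadcast_address)
print(low, int(ip), high)  # 16909056 16909060 16909311
print(low &lt;= int(ip) &lt;= high)  # True
</code></pre>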
<h1>Marmot Databases</h1>
<p>Questions at Marmot are primarily asked via</p>
<ol>
<li>Amazon Athena</li>
<li>PostgreSQL</li>
</ol>
<h2>Athena</h2>
<p>Athena is used for operational simplicity, cost, and threat hunting on medium-data<sup>3</sup> with limited sanitization. Athena was used for the honeypot <a href="https://marmot.studio/posts/20220707/">blog post</a> and was the only content database for the first iterations of Marmot. It continues to be used in parallel with PostgreSQL.</p>
<h2>PostgreSQL</h2>
<p>PostgreSQL<sup>4</sup> was introduced to help with:</p>
<ol>
<li>rapid iteration on queries and table schemas for new features</li>
<li>support for highly repetitive queries</li>
<li>flexibility over how IP addresses are queried</li>
</ol>
<h3>Rapid iteration on queries and table schemas for new features</h3>
<p><em>Design considerations</em>: All threat intel, geolocation, and honeypot data is persisted to S3 on acquisition. Post acquisition, the data is ETL'd into a bucket for Athena or a table in PostgreSQL. The ETL job is not related to the pruning of acquired data. Once your first ETL scripts for Athena and PostgreSQL are configured in Terraform, additional prototype jobs become ~trivial.</p>
<p>When developing a new feature, I focus on what questions I want to ask before I dive into normalizing data or choose the database. Running queries in Athena and updating Athena schemas has an inherent latency that slows down my development flow. Compounding the latency, I sometimes get the answer to my question and realize I was asking the wrong question.</p>
<p>Once I have validated my questions, how often those questions will be asked, and other details, I choose Athena or PostgreSQL.</p>
<h3>Support highly repetitive queries</h3>
<p>For non-bulk data that will see query rates of several per minute, I lean towards PostgreSQL for cost and user-experience.</p>
<p><em>Cost</em>: Ultimately, I can control costs with API rate limiting. But in an MVP phase, I need to see how users want to use the product, not how they have to use it. With Athena, I may have to use stricter API limits for cost control, and that could be premature. </p>
<p><em>User-Experience</em>: Athena is asynchronous. Without a UI your users will have to run two to three commands per question. PostgreSQL can be implemented with synchronous behavior via your application for small responses<sup>5</sup>.</p>
<p>Asynchronous:</p>
<ol>
<li>Ask the question</li>
<li>Is the answer ready</li>
<li>Download the response</li>
</ol>
<p>Synchronous:</p>
<ol>
<li>Ask the question, receive response.</li>
</ol>
<p>For the right use case, Athena is very cost-effective versus an <a href="https://aws.amazon.com/rds/">RDS</a> for PostgreSQL instance. For queries that can
rely on indexes and caching, PostgreSQL can be cheaper and faster.</p>
<h3>Flexibility over how IP addresses are queried</h3>
<p>IP addresses are queried in Athena as integers. There is no <a href="https://docs.aws.amazon.com/athena/latest/ug/data-types.html">data-type</a> for IP address. Querying via integers occurs in databases that support IP addresses and CIDR notation as well.</p>
<h4>BETWEEN queries</h4>
<p><em>Design considerations</em>: Geolocation is not bound by CIDR blocks. It may include partial ranges. Consequently, you may always have a requirement to search IP ranges instead of CIDR ranges. </p>
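<p>The first row of the Geo ASN example table earlier in this post illustrates the point. A short sketch with the standard <code>ipaddress</code> module shows that its integer range is wider than the single CIDR listed beside it, and that two CIDR blocks are needed to cover it:</p>
<pre class="codehilite"><code class="language-python">from ipaddress import IPv4Address, summarize_address_range

# Integer range taken from the first row of the Geo ASN example table.
begin = IPv4Address(16859136)
end = IPv4Address(16871423)
print(begin, end)  # 1.1.64.0 1.1.111.255

# Smallest list of CIDR blocks that exactly covers the range.
cidrs = [str(net) for net in summarize_address_range(begin, end)]
print(cidrs)  # ['1.1.64.0/19', '1.1.96.0/20']
</code></pre>

<p>A lookup keyed only on the listed <code>1.1.96.0/20</code> would miss addresses such as <code>1.1.64.1</code>, while a BETWEEN query on the integers would not.</p>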
<hr />
<p><strong>BETWEEN: JOIN, Integers</strong></p>
<p>Example of a many-to-many query using a JOIN, BETWEEN</p>
<p><em>Use-case</em>: Bulk preparation of location data for every IP address observed over a time range in the honeypot or customer dataset. Useful for prepping data for your <a href="https://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a>s, reducing costs downstream and improving performance for customer queries. This query is applicable via Athena or PostgreSQL. This query gets expensive with IPv6<sup>6</sup>.</p>
<p><sub>Query requires that you have your IP addresses stored as integers (bigint, etcetera).</sub></p>
<pre class="codehilite"><code class="language-sql">SELECT
 hp.*,
 l4.country,
 l4.region
FROM honeypot hp
  INNER JOIN ip2location_ipv4 l4 ON hp.src_ip_int BETWEEN l4.begin_int AND l4.end_int;
</code></pre>

<hr />
<p><strong>BETWEEN: WHERE, Integers</strong></p>
<p>Example of a one-to-many query using WHERE, BETWEEN</p>
<p><em>Use-case</em>: Ad-hoc, user-driven queries. You will need to implement the conversion from IP to integer in your application for a positive user experience. This query is applicable via Athena or PostgreSQL.</p>
<p><sub>Query requires that you have your IP addresses stored as integers (bigint, etcetera).</sub></p>
<pre class="codehilite"><code class="language-sql">SELECT
  *
FROM ip2location_ipv4 l4
WHERE 16909060 BETWEEN l4.begin_int AND l4.end_int;
</code></pre>

<p><code>16909060</code> is equivalent to <code>1.2.3.4</code><sup>7</sup>. The user provided <code>1.2.3.4</code> in this example.</p>
<hr />
<p><strong>BETWEEN: WHERE, IP Address</strong></p>
<p>Example of a one-to-many query using WHERE, BETWEEN</p>
<p><em>Use-case</em>: Ad-hoc, user-driven queries where there is no CIDR block to query. Often associated with higher resolution geolocation queries. This case casts an IP address string to the <code>inet</code> data-type. <code>tmp.network_addr</code> and
<code>tmp.broadcast_addr</code> are already set to data-type <code>inet</code> in the schema. This query is applicable via PostgreSQL.</p>
<pre class="codehilite"><code class="language-sql">SELECT
  tmp.*
FROM ip2location_ipv4 tmp
WHERE '1.2.3.4'::inet BETWEEN tmp.network_addr AND tmp.broadcast_addr;
</code></pre>

<p>The user provided <code>1.2.3.4</code> in this example.</p>
<hr />
<p><strong>IP Addresses and CIDR blocks</strong></p>
<p>If your dataset has CIDR blocks, like AWS and GCP ranges, you can use the PostgreSQL <a href="https://www.postgresql.org/docs/14/functions-net.html">network operators</a>.</p>
<p><em>Use-case</em>: User-driven queries where the dataset has a CIDR block to query. This example casts the data-type for the input data as well as the dataset.</p>
<pre class="codehilite"><code class="language-sql">SELECT
    *
FROM aws_ranges aws
WHERE '1.2.3.4'::inet &lt;&lt; aws.ip_prefix::cidr;
</code></pre>

<p>The user provided <code>1.2.3.4</code> in this example.</p>
<h1>Summary</h1>
<p>Security is not one size fits all. Different features and different customers will have varying access patterns for data. It is important to have the ability to serve data via varying database types for the best user-experience and product efficacy. It is equally essential to support your engineers with the tooling they need to experiment and iterate quickly. Plan for any medium-data or larger dataset to need both map-reduce and relational database capabilities.</p>
<h1>Appendix</h1>
<p><strong>Remarks</strong></p>
<ol>
<li>Marmot: Ambiguous name for my security platform. </li>
<li>integer: A number without a fractional component. Not to be confused with a database or language data-type.</li>
<li>medium-data: Smaller than Big-Data, unless you have a marketing department.</li>
<li>PostgreSQL is the AWS RDS implementation for this post.</li>
<li>Pagination could be async. Recommend reading the <a href="https://use-the-index-luke.com/sql/partial-results/fetch-next-page">following</a> if you need pagination.</li>
<li>More data more problems. IPv6 Geo mappings are large in the number of rows and the size of the integers.</li>
<li>Converting an IP address to an integer in Python:
<pre class="codehilite"><code class="language-python">from ipaddress import IPv4Address

print(int(IPv4Address('1.2.3.4')))  # 16909060
</code></pre>
</li>
</ol>
<p>Image in post created using midjourney.com</p>
<p><strong>Databases</strong></p>
<p>I reference Athena and PostgreSQL exclusively in this doc. Other databases support these use-cases. Choose the database appropriate for your environment or infrastructure.</p>
<p><strong>Data sanitization</strong></p>
<p>Validate all user supplied data before submitting it to your database. Reject what is not expected.</p>
<p>Basic examples:</p>
<ul>
<li>Length limitations for integers</li>
<li>Validate IP addresses with the <a href="https://docs.python.org/3/library/ipaddress.html">ipaddress library</a> or the equivalent in your language of choice.</li>
</ul>
<p><a href="https://www.psycopg.org/docs/usage.html">Parameterize</a> all data submitted by your users to the database. There is no wiggle room on this.</p>
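<p>A minimal sketch of the two basic validation examples, assuming the input arrives as strings. The function names and the 39-digit limit (enough for the largest IPv6 address expressed as an integer) are my own assumptions for illustration. Validated values should still be passed to the database as query parameters, never concatenated into SQL.</p>

```python
from ipaddress import ip_address

# Assumption: 39 digits is enough for the largest IPv6 address as an integer.
MAX_INTEGER_DIGITS = 39


def validate_ip(value: str) -> str:
    """Return the normalized IP address, or raise ValueError if it is not one."""
    return str(ip_address(value.strip()))


def validate_integer(value: str, max_digits: int = MAX_INTEGER_DIGITS) -> int:
    """Return value as an int, rejecting anything but a short run of ASCII digits."""
    text = value.strip()
    if not (text.isascii() and text.isdigit()) or len(text) > max_digits:
        raise ValueError(f"unexpected integer input: {value!r}")
    return int(text)


print(validate_ip("1.2.3.4"))
print(validate_integer("16909060"))
```

<p>Both functions raise <code>ValueError</code> on anything unexpected, so the caller can reject the request before it reaches the database.</p>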
<hr />
<p>(c) Michael Bentley 2022</p>
<p>Contents may not be republished without written consent.</p>]]&gt;</description>
</item>
<item>
<title>Data brokers, spam messages, voicemail and Stan</title>
<link>https://bentley.foo/data-brokers-spam-messages-voicemail-and-stan.html</link>
<guid>https://bentley.foo/data-brokers-spam-messages-voicemail-and-stan.html</guid>
<pubDate>Thu, 11 Aug 2022 00:00:00 -0000</pubDate>
<description>&lt;![CDATA[<p>My cell phone receives what I consider to be an excessive number of unsolicited text messages.
Between January 1 and August 10, 2022, it received 76 unsolicited messages, or one message every 2.9 days.</p>
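<p>The rate can be reproduced with a few lines of Python. This sketch assumes both endpoint dates are counted as part of the window.</p>

```python
from datetime import date

first_day = date(2022, 1, 1)
last_day = date(2022, 8, 10)
message_count = 76

# Count both endpoints, since a message could arrive on either day.
days_in_window = (last_day - first_day).days + 1
days_per_message = days_in_window / message_count
print(days_in_window, round(days_per_message, 1))
```

<p>The window is 222 days, which works out to roughly one message every 2.9 days.</p>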
<p><em>Number of unsolicited text messages per day since Jan 1, 2022</em></p>
<p><img src="/images/20220811/bc_1.png" alt="spam over time" style="max-width: 1000px; width: 100%; height: auto;"></p>
<h1>Types of messages and how I respond</h1>
<h2>Banking fraud</h2>
<p>When I receive a text message with a URL that is likely banking fraud, I do the following:</p>
<ol>
<li>Run a whois to get the registrar’s abuse@ email address. </li>
<li>Take a screenshot of the message and send it to the abuse@ email with context to support the fraud claim.</li>
</ol>
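<p>The first step can be partly automated. The sketch below pulls abuse@ style addresses out of whois output. The sample text is a hypothetical excerpt, and real output varies by registrar and TLD, so treat the pattern as a starting point rather than a complete parser.</p>

```python
import re

# Matches an email address whose local part starts with "abuse".
ABUSE_EMAIL = re.compile(r"\babuse[\w.+-]*@[\w-]+(?:\.[\w-]+)+", re.IGNORECASE)


def find_abuse_contacts(whois_text: str) -> list[str]:
    """Return the unique abuse@ style addresses found in whois output, in order."""
    seen = []
    for match in ABUSE_EMAIL.findall(whois_text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen


# Hypothetical whois excerpt; the domain and registrar are made up.
sample = """Domain Name: EXAMPLE-BANK-LOGIN.TEST
Registrar: Example Registrar, Inc.
Registrar Abuse Contact Email: abuse@registrar.example
Registrar Abuse Contact Phone: +1.5555550100
"""
print(find_abuse_contacts(sample))
```

<p>Assuming the <code>whois</code> command-line tool is installed, its output can be captured with <code>subprocess.run(["whois", domain], capture_output=True, text=True).stdout</code> and passed to <code>find_abuse_contacts</code>.</p>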
<p>Registrars take this seriously, and domain take-downs often happen within 24 hours. It typically takes me
less than five minutes. This task can be accomplished while waiting for a Terraform deploy, or while
someone is talking about the demise of snack selection during all-hands meetings.</p>
<p><img src="/images/20220811/abuse_at_1.png" alt="abuse email response" style="max-width: 500px; width: 100%; height: auto;"></p>
<p>Ad-hoc blocking of the sending phone number for banking fraud has had little to no effect on repeat messages.
I suspect carriers remediate the sources of this activity fairly quickly on their own, reducing any
impact I can achieve.</p>
<h2>Odd messages that are potentially phishing attempts</h2>
<p>I reply if I have time or someone is still going on about snacks. Responses from the sender are rare.
When I get a response it appears to be either nonsense or a case of mistaken identity/wrong number.</p>
<p><img src="/images/20220811/lewis_1.png" alt="phishing message for Lewis" style="max-width: 300px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/naomi_1.png" alt="phishing message from naomi" style="max-width: 300px; width: 100%; height: auto;"></p>
<h2>Real estate sale calls and messages</h2>
<p>For 2+ years, I have been receiving large numbers of voice calls and some text messages with cash offers to buy a house
belonging to someone named Stan. These calls and texts occur at inconvenient hours for me, something I attribute to
having a phone number that is not tied to the timezone where I live or work.</p>
<p>Initially, I used my phone settings to route all unknown calls to voicemail and silence unknown numbers for texts.
This is a less than desirable response, as it also skips legitimate calls from unexpected numbers, such as calls from my doctor
or calls about an incident response in progress. Additionally, my voicemail alerts stopped once I reached 99 unread
voicemails.</p>
<p>I'm not a fan of voicemail, but I'm even less a fan of service degradation due to</p>
<ul>
<li>calls not intended for me</li>
<li>the red dot notification indicating I have dozens of unread text messages</li>
</ul>
<p>I decided to review the messages and start calling the realtors back. 
Reviewing transcribed voicemails helped me recognize that the callers were all using the same or similar sources of 
data. Each voicemail left detailed personal information about Stan, including his address.</p>
<p>My goal was to find out where the realtors were sourcing their data. The vast majority of realtors never answered my
calls or voice messages. But one realtor did. I explained that her company was calling
at inappropriate hours, and that I was not the intended recipient. I asked about their source of data. She did not know
the answer on the spot, but she did reach back out and stated that they source their material from Lexis Nexis.</p>
<p>I went to the Lexis Nexis site to search for a way to correct the information. The takeaways from my search were</p>
<ol>
<li>There are dozens of data brokers</li>
<li>There appears to be no way to correct your information when it is associated with another person's account</li>
<li>I am not the only one with this issue</li>
</ol>
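<p>A DuckDuckGo search yields many references for this issue, such as</p>
<ul>
<li><a href="https://www.newsweek.com/2019/10/04/lexisnexis-mistake-data-insurance-costs-1460831.html">https://www.newsweek.com/2019/10/04/lexisnexis-mistake-data-insurance-costs-1460831.html</a></li>
<li><a href="https://old.reddit.com/r/legaladvice/comments/69mgbx/lexisnexis_has_incorrect_information_about_me_and/">https://old.reddit.com/r/legaladvice/comments/69mgbx/lexisnexis_has_incorrect_information_about_me_and/</a></li>
</ul>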
<p>I filled out the Lexis Nexis form to receive a copy of my data and opt out of their services that I never opted in to.
My report never arrived. I did get a confirmation that my data would be removed from my account. Removing my data has
zero impact on any data that they have already sold and which has subsequently been re-shared into the data ether.</p>
<p>The real estate text messages and voicemails for Stan have not stopped.</p>
<p><img src="/images/20220811/stan_1.png" alt="real estate spam" style="max-width: 300px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/stan_2.png" alt="real estate spam" style="max-width: 300px; width: 100%; height: auto;"></p>
<h2>Political messages for Stan and Gayle</h2>
<p>By far, the largest number of messages I receive are politically oriented. Like the real estate messages,
these do not come at a reasonable hour for where I live. I have responded to these messages by</p>
<ul>
<li>Replying STOP and variations of it</li>
<li>Attempting to call the number. This was ineffective because the majority of messages come from some type of API-based service</li>
<li>Requesting them to stop in a frustrated tone</li>
</ul>
<p>Nothing has stemmed the influx of political text messages.</p>
<h1>Analyzing Text Messages</h1>
<p>On August 10, 2022, I received a text message while roaming and decided to see if I could take a similar approach
to the political messages as I do with banking fraud.</p>
<p><img src="/images/20220811/messages/pr/pr_44.png" alt="" style="max-width: 300px; width: 100%; height: auto;"></p>
<p>These messages are not fraud, but I do expect them to violate
a carrier's acceptable use policy since</p>
<ul>
<li>I can't opt out or stop the messages</li>
<li>I never opted in</li>
<li>I'm not the intended recipient</li>
<li>The quantity is excessive</li>
</ul>
<p>To start my investigation I signed up for a Twilio account and used Twilio's <a href="https://console.twilio.com/us1/develop/lookup/lookup">phone lookup system</a>.
In this instance, the carrier for the message was identified by Twilio to be bandwidth.com.</p>
<p><em>Twilio Phone Lookup</em>
<img src="/images/20220811/twilio_bandwidth.png" alt="" style="max-width: 700px; width: 100%; height: auto;"></p>
<p>I browsed to the bandwidth.com site and filled in the form. Good news, they "are here to help".</p>
<p><em>Bandwidth.com Site</em>
<img src="/images/20220811/bandwidth_1.png" alt="" style="max-width: 500px; width: 100%; height: auto;"></p>
<p>Initially, the bandwidth.com form would not work, citing incorrect form data. After several attempts with the same information,
the form allowed submission.
I received a response so fast I knew it would not be good news.</p>
<p><em>Bandwidth Response</em>
<img src="/images/20220811/bandwidth_2.png" alt="" style="max-width: 400px; width: 100%; height: auto;"></p>
<p>My interpretation of Bandwidth.com's response is</p>
<ul>
<li>they are a wholesale provider</li>
<li>they are not responsible for how their network is used</li>
<li>they would forward my complaint to an unknown entity</li>
</ul>
<p>I replied that this was a problem of excess, and cc'd their legal@ email address and what I guessed to be their CEO's email address.</p>
<p>Their legal team auto-responded, telling me to use the form that I had already filled out. I suppose they have received
these messages before, and the most logical response was to do nothing. I did not feel helped.</p>
<p><em>legal@bandwidth.com response</em>
<img src="/images/20220811/bandwidth_3.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><em>Summary of my experience escalating to Bandwidth.com</em></p>
<p><img src="/images/20220811/bandwidth_abuse_workflow.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p>Bandwidth.com will not disclose the sub-carrier or service, which blocks any attempt on my part to resolve this at the carrier
level. Further, they push acceptable use entirely to their undisclosed customer.</p>
<p>This was a frustrating outcome. At this point I decided to analyze all unsolicited messages I received since 
Jan 1, 2022 and look for common ground.</p>
<h2>Methodology</h2>
<p><strong>Define spam</strong> </p>
<p>Any unsolicited text message. The following do not count as spam:</p>
<ul>
<li>Personal messages</li>
<li>SMS verifications for service sign-ups</li>
<li>Automated messages for appointments</li>
</ul>
<p><strong>Transcribe</strong></p>
<p>I did not find a way to copy the messages from my phone directly to my computer with OS provided tools.
I did not want to use a third-party tool to do this. Not deterred, I took the age-old accepted security 
response approach of spreadsheet-triage. I manually copied the date and source phone number for every 
unsolicited message into a spreadsheet. I labeled the messages with</p>
<ul>
<li>Classification category</li>
<li>Primary Type</li>
<li>Secondary Type</li>
<li>Domain - If there was a domain and what the domain was</li>
<li>Path - any URL text past the domain</li>
<li>Image - If an image was contained</li>
<li>Domain registrar</li>
<li>Mentions Stan</li>
<li>Mentions Gayle</li>
<li>Carrier</li>
</ul>
<p>The classifications are</p>
<ul>
<li>Spam - Greeting</li>
<li>Spam - Survey</li>
<li>Election Poll</li>
<li>Fraud - Banking</li>
<li>Britney Spears</li>
<li>Real Estate</li>
<li>Political</li>
</ul>
<p>Primary Type and Secondary Type focus on the contents of the message. 
Carrier is the phone number carrier as reported by the Twilio.com phone number lookup service.</p>
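<p>The spreadsheet layout described above maps naturally onto a small record type. This is a minimal sketch with a subset of the columns. The field names are my own, and the rows are made-up examples rather than data from the spreadsheet.</p>

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SpamMessage:
    received: date
    source_number: str
    classification: str
    carrier: str
    domain: Optional[str] = None  # None when the message had no domain
    mentions_stan: bool = False
    mentions_gayle: bool = False


# Hypothetical rows; the phone numbers, dates, and domain are invented.
rows = [
    SpamMessage(date(2022, 8, 10), "+15555550101", "Political", "Bandwidth.com", "example.test"),
    SpamMessage(date(2022, 8, 9), "+15555550102", "Political", "Telnyx"),
    SpamMessage(date(2022, 7, 1), "+15555550103", "Real Estate", "Bandwidth.com", mentions_stan=True),
]

# Count messages per (classification, carrier) pair, as in a pivot table.
by_class_and_carrier = Counter((row.classification, row.carrier) for row in rows)
for (classification, carrier), count in sorted(by_class_and_carrier.items()):
    print(classification, carrier, count)
```

<p>The same counter, keyed on other fields, would feed any of the charts in the next section.</p>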
<h2>Data points</h2>
<hr />
<p><strong>Classification Over Time</strong></p>
<p><img src="/images/20220811/classification_over_time.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<hr />
<p><strong>Classification To Carrier</strong></p>
<p><img src="/images/20220811/classification_to_carrier.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<ul>
<li>For classification to carrier I shortened the carrier name. For example <code>Bandwidth SMSEnabled - Bandwidth CLEC - Sybase365</code> was 
shortened to <code>Bandwidth.com</code>. The mapping is posted in the Appendix.</li>
<li><code>Null</code> and <code>Unknown</code> are distinct responses from the Twilio lookup service. Those responses have been preserved
in the chart.</li>
</ul>
<p>Bandwidth.com and Telnyx cover the majority of political messages. </p>
<hr />
<p><strong>Carriers To Text Messages With Domains</strong></p>
<ul>
<li>Text messages with no domain are indicated by <code>FALSE</code>.</li>
</ul>
<p><img src="/images/20220811/carrier_to_domain.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<hr />
<p><strong>Domain mapping</strong></p>
<p>The majority of political domains are fronts for hxxps://winred.com</p>
<ul>
<li><code>True</code>: Domain fronts for winred </li>
<li><code>False</code>: Domain does not front for winred</li>
<li>Text messages without domains have been filtered out</li>
</ul>
<p><img src="/images/20220811/domains_to_winred.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<p>I browsed to the Winred site to opt out of messages. Their site has a chatbot that provides categories
of questions, including text messages.</p>
<p><img src="/images/20220811/winred.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<p>Winred's response is to work with each campaign individually. The issues with this are</p>
<ol>
<li>Volume - You are attempting to mitigate a many-to-one attack</li>
<li>New sources - There is no source of truth; each campaign is a net new source</li>
<li>STOP and other replies via text message are partially or entirely ignored</li>
</ol>
<p>Winred's approach is similar to Bandwidth.com's; in effect, there is no practical way to stop unsolicited messages
via carriers or organizations using the carriers.</p>
<p>In every case, the parties involved in sending claim no responsibility or authority.</p>
<p><img src="/images/20220811/winred_opt_out.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<h1>Suggested industry requirements</h1>
<p>Note: I do not work in the carrier industry. </p>
<p><strong>Carrier identification for any text message</strong></p>
<p>A consumer should be able to identify the carrier or carrier customer account responsible for sending an API based or 
automated message. This should be trivial to accomplish via the message itself and any references via domains.</p>
<p>Identification via Twilio is not a consumer friendly option. </p>
<p><strong>Opt-out or block the carrier's customer account</strong></p>
<p>A consumer should be able to send a single STOP response and block all messages from the carrier
customer account. </p>
<p><strong>Mobile service providers should take a hostile approach</strong></p>
<p>Some API-based senders are not operating in good faith. Providers and MVNOs should take a hostile approach
towards these senders. I am not a fan of my mobile phone provider choosing what I can and cannot see.
But this is service degradation at this point. Take the same approach you would for any other
network-based attack and null route it.</p>
<p><strong>STOP should not convert to known sender</strong></p>
<p>Replying STOP to an unknown sender converts that sender to a known sender on iPhones. This effectively
disables the mitigation. Further, STOP should be standardized. Not Stop2End or Stop=End.</p>
<p><strong>The FTC should add a portal</strong></p>
<p>Currently, the FTC <a href="https://consumer.ftc.gov/articles/how-recognize-and-report-spam-text-messages#what_to_do">recommends</a> using your phone messaging settings to block this activity. This method does
not work in a many-to-one attack. The FTC should add a portal where complaints can be escalated, investigated,
and sources fined or shut down.</p>
<p><img src="/images/20220811/ftc.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<p><strong>Potential downsides of suggestions</strong></p>
<p>I suspect customers of API-based carriers will continue to act in bad faith. Any identification capabilities are likely
to take the route of cookies, where companies choose overtly obtrusive implementations rather than
following the spirit of the regulation. However, text messages offer limited real estate, and I suspect
that egregious implementations will have as negative an impact on the sender as on the receiver.</p>
<p>In summary: if you have a website or legal response that immediately acknowledges abuse on your platform
and explains how you are not responsible, you are operating in bad faith.</p>
<p><em>Let's go Bandwidth.com</em></p>
<p><img src="/images/20220811/lets_go_bw.png" alt="" style="max-width: 1000px; width: 100%; height: auto;"></p>
<p>Source image from <a href="https://en.wikipedia.org/wiki/Let%27s_Go_Brandon#/media/File:Let's_Go_Brandon_Florida_house.jpg">Wikipedia</a></p>
<h1>Appendix</h1>
<p>Tools used</p>
<ul>
<li>Graphs built using Amazon QuickSight</li>
<li>Keynote</li>
<li>Spreadsheets</li>
<li>Skitch</li>
<li>Twilio Phone Lookup</li>
</ul>
<h2>Screenshots of text messages</h2>
<p>I have truncated domains and other information that may be associated with my phone.</p>
<h4>How to read this data</h4>
<p>Messages are organized by categories</p>
<ul>
<li>Spam - Greeting</li>
<li>Spam - Survey</li>
<li>Election Poll</li>
<li>Fraud - Banking</li>
<li>Britney Spears</li>
<li>Real Estate</li>
<li>Political
<ul>
<li>Democratic</li>
<li>Republican</li>
</ul>
</li>
</ul>
<p>Within categories, messages are ordered left to right, top to bottom.</p>
<ul>
<li>Messages on the left arrived before messages on the right</li>
<li>Messages above arrived before messages below</li>
</ul>
<p>The number of screenshots will exceed the number of messages in the charts.
I did not count images and multiple text messages sent at the same time as distinct messages.</p>
<hr />
<h3>Spam - Survey</h3>
<p><img src="/images/20220811/messages/sp/sp_1.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<hr />
<h3>Spam - Greeting</h3>
<p><img src="/images/20220811/messages/sg/sg_1.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/sg/sg_2.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/sg/sg_3.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<hr />
<h3>Election Poll</h3>
<p><img src="/images/20220811/messages/ep/ep_1.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<hr />
<h3>Fraud - Banking</h3>
<p><img src="/images/20220811/messages/bf/bf_1.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<hr />
<h3>Britney Spears</h3>
<p><img src="/images/20220811/messages/fb/fb_1.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<hr />
<h3>Real Estate</h3>
<p><img src="/images/20220811/messages/re/re_1.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<hr />
<h3>Political</h3>
<h4>Democratic</h4>
<p><img src="/images/20220811/messages/pd/pd_1.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pd/pd_2.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pd/pd_3.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<h4>Republican</h4>
<p><img src="/images/20220811/messages/pr/pr_1.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_2.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_3a.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_3b.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_4.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_5.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_6.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_7.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_8.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_9.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_10.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_11.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_12.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_13.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_14.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_15.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_16.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_17.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_18.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_19.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_20a.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_20b.png" alt="" style="max-width: 250px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_21.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_22.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_23.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_24.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_25.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_26.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_27.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_28.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_29.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_30.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_31.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_32.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_33.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_34.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_35.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_36.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_37.png" alt="" style="max-width: 490px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_38a.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_38b.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_39.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_40.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_41.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_42.png" alt="" style="max-width: 800px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_43.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<p><img src="/images/20220811/messages/pr/pr_44.png" alt="" style="max-width: 245px; width: 100%; height: auto;"></p>
<hr />
<h2>Bandwidth.com's <a href="https://www.bandwidth.com/legal/acceptable-use-policy/">Acceptable Use Policy</a></h2>
<p>Section: Continuous or Repetitive Calls and Messaging.</p>
<p><img src="/images/20220811/bw_aup.png" alt="" style="max-width: 600px; width: 100%; height: auto;"></p>
<h2>Carrier Mappings</h2>
<table>
<thead>
<tr>
<th>Name reported by Twilio</th>
<th>Short name for graphs</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bandwidth SMSEnabled - Bandwidth CLEC - Sybase365</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>Bandwidth.com CLEC, LLC</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>Bandwidth/13 - Bandwidth.com - SVR</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>Bandwidth/20 - Bandwidth.com - SVR</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>Bandwidth/Zipwhip/3 - Toll-Free - SVR</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>Commio, LLC</td>
<td>Commio</td>
</tr>
<tr>
<td>Google (Grand Central) - SVR</td>
<td>Google Grand Central</td>
</tr>
<tr>
<td>Hook Mobile - Sybase365</td>
<td>Hook Mobile</td>
</tr>
<tr>
<td>Null</td>
<td>Null</td>
</tr>
<tr>
<td>Unknown</td>
<td>Unknown</td>
</tr>
<tr>
<td>Plivo - SVR</td>
<td>Plivo</td>
</tr>
<tr>
<td>T-Mobile USA, Inc.</td>
<td>T-Mobile USA</td>
</tr>
<tr>
<td>Telefinity/teli.net - SVR</td>
<td>Telefinity</td>
</tr>
<tr>
<td>Telnyx - Level3 - SVR</td>
<td>Telnyx</td>
</tr>
<tr>
<td>Telnyx - Telnyx - SVR</td>
<td>Telnyx</td>
</tr>
<tr>
<td>Telnyx - Windstream - SVR</td>
<td>Telnyx</td>
</tr>
<tr>
<td>TextNow - Bandwidth.com - SVR</td>
<td>Bandwidth.com</td>
</tr>
<tr>
<td>TextNow - Neutral Tandem - SVR</td>
<td>TextNow</td>
</tr>
<tr>
<td>Twilio - SMS/MMS-SVR</td>
<td>Twilio</td>
</tr>
<tr>
<td>Twilio - Toll-Free - SMS-Sybase365/MMS-SVR</td>
<td>Twilio</td>
</tr>
</tbody>
</table>
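<p>The mapping in the table above can be expressed as a lookup table. This is a sketch of how the shortening could be applied programmatically. The function name is hypothetical, and the fallback for unmapped names is my own choice.</p>

```python
# Carrier name reported by the Twilio lookup mapped to the short name used in graphs.
CARRIER_SHORT_NAMES = {
    "Bandwidth SMSEnabled - Bandwidth CLEC - Sybase365": "Bandwidth.com",
    "Bandwidth.com CLEC, LLC": "Bandwidth.com",
    "Bandwidth/13 - Bandwidth.com - SVR": "Bandwidth.com",
    "Bandwidth/20 - Bandwidth.com - SVR": "Bandwidth.com",
    "Bandwidth/Zipwhip/3 - Toll-Free - SVR": "Bandwidth.com",
    "Commio, LLC": "Commio",
    "Google (Grand Central) - SVR": "Google Grand Central",
    "Hook Mobile - Sybase365": "Hook Mobile",
    "Null": "Null",
    "Unknown": "Unknown",
    "Plivo - SVR": "Plivo",
    "T-Mobile USA, Inc.": "T-Mobile USA",
    "Telefinity/teli.net - SVR": "Telefinity",
    "Telnyx - Level3 - SVR": "Telnyx",
    "Telnyx - Telnyx - SVR": "Telnyx",
    "Telnyx - Windstream - SVR": "Telnyx",
    "TextNow - Bandwidth.com - SVR": "Bandwidth.com",
    "TextNow - Neutral Tandem - SVR": "TextNow",
    "Twilio - SMS/MMS-SVR": "Twilio",
    "Twilio - Toll-Free - SMS-Sybase365/MMS-SVR": "Twilio",
}


def short_carrier_name(reported: str) -> str:
    """Map a carrier name reported by the lookup to its short graph label."""
    # Fall back to the reported name so unmapped carriers stay visible in charts.
    return CARRIER_SHORT_NAMES.get(reported, reported)


print(short_carrier_name("Telnyx - Level3 - SVR"))
```

<p>Falling back to the reported name means a new carrier shows up in a chart under its full name instead of silently disappearing.</p>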
<hr />
<h1>Updates</h1>
<p><strong>August 29, 2022</strong></p>
<p>This morning, I received another realtor call for Stan. I explained the situation to the realtor, and he took the time to share information on their data set with me. This particular realtor is sourcing data from three companies.</p>
<ol>
<li>True People Search [ <a href="truepeoplesearch.io">truepeoplesearch.io</a> ]</li>
<li>Fast People Search [ <a href="fastpeoplesearch.info">fastpeoplesearch.info</a> ]</li>
<li>Lexis Nexis [ <a href="lexisnexis.com">lexisnexis.com</a> ]</li>
</ol>
<p>He also shared that there were two phone numbers listed for Stan: my phone number and another number with the same last seven digits but a different area code. Example:</p>
<ul>
<li>415-123-1234</li>
<li>210-123-1234</li>
</ul>
<p>I appreciate another realtor taking the time to help triage data broker false positives. I recommend inquiring into the false positive rate and data validation methods before entering into any contract or commercial services with these companies. I have no way to validate how often individuals have matching numbers across area codes. I suspect the occurrence rate to be exceptionally small and this example to be indicative of serious quality control issues.</p>
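<p>The pattern in the example is straightforward to flag mechanically. The sketch below is a hypothetical check of the kind a data broker could run on a person's record, not something any of these companies is known to do. It assumes 10-digit US numbers and groups them by their last seven digits.</p>

```python
import re
from collections import defaultdict


def split_number(number: str) -> tuple[str, str]:
    """Split a 10-digit US number into (area_code, last_seven_digits)."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) != 10:
        raise ValueError(f"expected a 10-digit US number: {number!r}")
    return digits[:3], digits[3:]


def shared_local_numbers(numbers: list[str]) -> dict[str, list[str]]:
    """Return last-seven-digit groups that appear under two or more area codes."""
    groups = defaultdict(set)
    for number in numbers:
        area_code, local = split_number(number)
        groups[local].add(area_code)
    return {local: sorted(codes) for local, codes in groups.items() if len(codes) > 1}


# The two example numbers from the list above, plus one unrelated made-up number.
print(shared_local_numbers(["415-123-1234", "210-123-1234", "415-555-0100"]))
```

<p>A record that trips this check is a candidate for manual review before it is sold as belonging to one person.</p>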
<p>I have reached out via a contact form to Fast People Search and email to True People Search in an attempt to have my information removed from Stan's records. Both sites have opt-out forms, but they are only for the person associated with the data, Stan.</p>
<hr />
<p>(c) Michael Bentley 2022</p>
<p>Contents may not be republished without written consent.</p>]]&gt;</description>
</item>
</channel>
</rss>