<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://gfrison.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://gfrison.com/" rel="alternate" type="text/html" /><updated>2026-02-25T10:32:57+01:00</updated><id>https://gfrison.com/feed.xml</id><title type="html">Accidental Pitch ♮</title><subtitle></subtitle><author><name>Giancarlo Frison</name></author><entry><title type="html">Pull Down Programming Complexity with Kubrick</title><link href="https://gfrison.com/2026/pull-down-programming-complexity-kubrick" rel="alternate" type="text/html" title="Pull Down Programming Complexity with Kubrick" /><published>2026-01-31T00:00:00+01:00</published><updated>2026-02-03T00:00:00+01:00</updated><id>https://gfrison.com/2026/pull-down-programming-complexity-kubrick</id><content type="html" xml:base="https://gfrison.com/2026/pull-down-programming-complexity-kubrick"><![CDATA[<style>
.responsive-figure {
  width: 100%;
  margin: auto;
  text-align: center;
}
@media (min-width: 768px) {
  .responsive-figure {
    width: 70%;
  }
}
</style>

<p>AI tools are now the horsepower of computer programming. They are generally great at glue code and integration tasks, but probably less than ideal for complex problems in complicated programming settings. What could be the reasons preventing generators from full-scale adoption?</p>

<p>I am implementing a declarative programming language that facilitates the synergy between automatons and humans in software development, by forcing AI tools to generate intuitive code and by allowing human operators to understand what is in there. It is an attempt to lower the barriers by simplifying the programming experience.</p>

<p>As a rationale, I will touch on various aspects, including the cognitive side of problem-solving applied to programming and the role of AI tools such as LRMs in software development. I will argue that the accidental complexity of the programming system may worsen the performance of AI generators just as it affects human developers.</p>

<p class="notice--primary">Maybe you want to read <em>“Pull Down Complexity with Kubrick”</em> in <a href="/assets/pull-down/px26.pdf">PDF</a>?</p>

<p>Started as a <em>data programming</em> language for pairing database queries with programming controls, <a href="https://gfrison.com/kubrick/">Kubrick</a> may evolve into an advanced integrated programming environment with the footprint of a <em>Jupyter notebook</em> and the immediacy of a <em>spreadsheet</em>. It lets agents focus on what they want to achieve, making easy things easy without losing expressivity.</p>

<p>To mitigate the lack of solution productivity<sup id="fnref:productivity" role="doc-noteref"><a href="#fn:productivity" class="footnote" rel="footnote">1</a></sup> in generators - the capacity of an optimizer to derive combinations of solutions from a set of given axioms - I extended the language to help agents tackle <em>combinatorial problems</em>. Think of answer set programming (ASP), but with an eye on integration and usability.</p>

<p>This ongoing project was the topic of my Master’s dissertation, <em>“Programming Language and System for Enhancing AI-Assisted Software Development”</em>, which I defended in December 2025; it summarizes several insights gathered during my experience in the field. I recently started to open-source it and will gradually share it on GitHub<sup id="fnref:kubrick" role="doc-noteref"><a href="#fn:kubrick" class="footnote" rel="footnote">2</a></sup>.</p>

<h1 id="cognitive-aspects-of-programming">Cognitive Aspects of Programming</h1>

<p>AI tools are everywhere. In every domain it is possible to find features that can benefit from AI capabilities. If we exclude certain high-risk sectors - nuclear plants, aviation control - AI is already assisting human labour, and the pace of adoption will certainly accelerate.</p>

<p>Language models were conceived for NLP tasks and trained on all sorts of available text, so it is quite natural to think about applying them to code generation. After all, programs are text, written and read like any other text. Is it then reasonable to expect LRMs to exhibit the same appreciable skill in programming as they do in generating phrases?</p>

<p>With AI tools it is no longer bizarre to ship entire live applications in hours rather than weeks, but their effectiveness may fall short of accomplishing what prompters want them to do. Complaints about GenAI performance can be summarized as:</p>

<ul>
  <li>Problems with multi-step reasoning.</li>
  <li>Struggles with mutable state and side effects.</li>
  <li>Shallow code understanding.</li>
  <li>Failure to meet requirements.</li>
  <li>Generation of complicated code.</li>
  <li>Given the same prompt, they generate different code.</li>
  <li>Given different prompts, they generate the same code.</li>
</ul>

<p>Building a program essentially means facing it from two distinct sides: the <a href="/2025/adaptive-programming-systems"><strong>problem domain</strong> and the <strong>solution domain</strong></a>. Understanding the problem to solve is the most important task, and how difficult the problem is to grasp relates to its <em>essential complexity</em>. The complexity of the solution domain, on the other hand, is referred to as <em>accidental complexity</em>, and it is introduced by the ecosystem necessary to implement the solution.</p>

<blockquote>
  <p><em>“If I had an hour to solve a problem, I’d spend 55 minutes thinking about the problem and 5 minutes thinking about the solution”</em> - A. Einstein</p>
</blockquote>

<p>The programming system consists of several <em>substrates</em>, including libraries, tools, and external systems; as they grow, the programming effort increases exponentially. While essential complexity is not negotiable, the accidental kind must be kept at the lowest level possible.</p>

<figure class="responsive-figure">
  <img src="/assets/pull-down/complexity.png" alt="Essential vs Accidental Complexity" />
  <figcaption>Essential vs Accidental Complexity</figcaption>
</figure>

<p>Many of the difficulties automatons and developers show when generating code might share a common root: <u>the accidental complexity carried by the programming system</u>. The intuition behind this project is to address this issue at its foundations.</p>

<p class="notice--info">While increasing the complexity of a system absorbs more of the engineer’s working memory, it also increases the perplexity<sup id="fnref:perplexity" role="doc-noteref"><a href="#fn:perplexity" class="footnote" rel="footnote">3</a></sup> of an LRM solving the same task. Perplexity measures how unexpected a token is to the model, which means that accidental complexity negatively impacts code generation, with more chances of introducing bugs not only by developers but also by automatons. Humans and artificial agents show a significant positive correlation when faced with similarly alienating settings.</p>
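<p>To make the perplexity measure concrete, here is a minimal Python sketch of how perplexity is computed from per-token log-probabilities. The numbers are purely illustrative, not real model output:</p>

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a sequence given per-token natural-log probabilities.

    The lower the perplexity, the less surprising each token was to the
    model; high values mean many unexpected tokens.
    """
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Illustrative per-token probabilities: a "familiar" snippet the model
# predicts confidently vs. an "alien" one full of accidental complexity.
familiar = [math.log(p) for p in (0.9, 0.8, 0.85, 0.9)]
alien = [math.log(p) for p in (0.3, 0.2, 0.25, 0.4)]

print(perplexity(familiar))  # low: tokens were expected
print(perplexity(alien))     # several times higher: more surprise
```

A model generating code in an unfamiliar, heterogeneous ecosystem behaves like the second case: each token is less predictable, and errors become more likely.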

<h1 id="how-to-reduce-complexity">How to reduce complexity</h1>

<figure class="responsive-figure">
  <img src="/assets/pull-down/plagiarism.png" alt="Plagiarism" />
  <figcaption>For favoring AI programming, apply the opposite strategies used to prevent plagiarism</figcaption>
</figure>

<p>Do you want to reduce accidental complexity and help agents program? Take inspiration from what teachers do to prevent<sup id="fnref:plagiarism" role="doc-noteref"><a href="#fn:plagiarism" class="footnote" rel="footnote">4</a></sup> AI-assisted plagiarism, and apply the opposite strategies to make programming easier:</p>

<h3 id="uniform-experience">Uniform experience</h3>

<p>From the standpoint of programming experience, what is daunting<sup id="fnref:perplexity:1" role="doc-noteref"><a href="#fn:perplexity" class="footnote" rel="footnote">3</a></sup> for agents is the heterogeneity of the ecosystem: different paradigms for configuration, data access, service orchestration, remote procedure calls, testing, and deployment. Each of these aspects requires specific skills and knowledge that increase accidental complexity. A <a href="/2025/adaptive-programming-systems">proper programming environment</a> should nullify those frictions by providing a consistent and uniform way to interact with it.</p>

<h3 id="open-authorship">Open authorship</h3>

<p>Software development often involves a hard separation between programmers and users. I believe that accidental complexity prevents final consumers from being empowered to change software applications. Are there examples that already allow this?</p>

<p>I think everybody has used a spreadsheet application at least once. One thing that is very clear to users is that a spreadsheet does not really have separate environments for programming and for use. It can be modified at any time by changing the data or the formulas it contains.</p>

<p>Notebooks like Jupyter or Google Colab let users naturally split problems into smaller pieces and solve them individually with immediate feedback. Both approaches lower accidental complexity, and Kubrick aims to combine them.</p>

<h3 id="self-sustainability">Self-sustainability</h3>

<blockquote>
  <p>Self-sustainability refers to the extent to which a system’s behavior can be changed without having to step outside to a lower implementation level<sup id="fnref:sustainability" role="doc-noteref"><a href="#fn:sustainability" class="footnote" rel="footnote">5</a></sup></p>
</blockquote>

<p>The strongest predictor of a low-code platform’s success is the ability to change the system’s behaviour from its deepest layers. This can be achieved by introducing <em>macros</em>, program fragments intended to generate programs inside the programming system itself. Why not take inspiration from a traditional language like <em>Lisp</em>? It has stood the test of time with great honor, thanks in part to its homoiconic nature, which makes macros easy to write.</p>

<h3 id="logic--functional">Logic + functional</h3>

<p>If we consider AI as an assistant, the programmer double-checks that what has been generated satisfies explicit and implicit requirements. Code is read more than it is written, and the immense capacity of generators to churn out large quantities of code can easily saturate human scrutiny, making a clear, concise, and easy language for encoding programs a necessity. We need to let agents express <em>what</em> they want to achieve rather than <em>how</em> to achieve it, and remove the need for boilerplate code.<br />
Immutability, control over side effects, unification, and pattern matching can definitely help drag down complexity.</p>

<h3 id="relation-algebra">Relational algebra</h3>

<p>I think one of the main complexity drivers is the impedance mismatch between query and programming languages. These idiosyncrasies are usually mediated by ORM frameworks, but their slippery slope<sup id="fnref:orm" role="doc-noteref"><a href="#fn:orm" class="footnote" rel="footnote">6</a></sup> can trigger more problems than it solves. Relational algebra is the basic foundation of database theory, and when combined with programming controls it provides a smooth experience for data programming.</p>

<figure class="responsive-figure">
  <img src="/assets/pull-down/join.png" alt="Relational Algebra as a unifying layer" />
  <figcaption>Relational Algebra as a unifying layer</figcaption>
</figure>

<h3 id="combinatorics">Combinatorics</h3>

<p>When programmers encounter new code, they <em>actively simulate</em> the program’s behaviour in their heads and create mental models of its structure and logic. This is why programming is, above all, a logically demanding task<sup id="fnref:cognition" role="doc-noteref"><a href="#fn:cognition" class="footnote" rel="footnote">7</a></sup>, with significant implications for how people interact with code. This is confirmed by the tools in use: debuggers, syntax highlighting, code formatting, and visualization applets all exist to support the brain’s reliance on logical simulation.</p>

<p>This attitude of replaying code behaviour confirms that intelligence cannot be divorced from the ability to search through a potentially infinite combination of concepts and rules. This is, in summary, the idea of productivity, which refers to the compositional generation of optimal propositions from a valid set of grounded statements.</p>

<p><em>Do automatons excel at that?</em> The building blocks of LRMs lack intrinsic search, though efforts have been made<sup id="fnref:reasoning" role="doc-noteref"><a href="#fn:reasoning" class="footnote" rel="footnote">8</a></sup> to force recursive reiteration over their conclusions in order to minimize hallucinations. How can intentional search be injected where it is lacking? This is where combinatorial programming can boost applications’ intelligence by commoditizing optimized solution search.</p>

<h1 id="make-easy-things-easily">Make easy things easily</h1>

<h2 id="and-complex-things-doable">…and complex things doable.</h2>

<p>No artefact comes out of a vacuum, and Kubrick is no exception. I have been inspired by a multitude of established ideas, rearranging them and creating new ones. From <strong>Prolog</strong> I took variable unification, automatic pattern matching on method activation, and the success/failure assertions for validating method invocations, typical of logic programming. From <strong>Julia</strong> I borrowed the fundamental recursive data structures for combining choices, named tuples, and sequences. From <strong>ASP</strong> I took the <em>stable model</em> semantics for reasoning about alternative solutions, and <em>choices</em>, the alternative values.</p>

<h3 id="a-combinatorial-use-case">A Combinatorial use case</h3>

<p>Central to the language is the recursive data type that includes sequences, named tuples, and choices:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grocery orange type-&gt;fruit price-&gt;1.2,
  rating-&gt;5 origin-&gt;italy;spain 
</code></pre></div></div>
<p>Choices are alternative values (or expressions): here they state that the orange may originate either from Italy or from Spain. Named tuples are key-value pairs, sequences are ordered collections, and any element can be a nested data structure.</p>

<p>Let’s assume we need to create purchase lists with only one item per type and grouped by origin:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cart Item -|1^Origin,1#Type| 
  grocery Item type-&gt;Type origin-&gt;Origin
</code></pre></div></div>

<p>Now you need to keep only purchases below €30:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>\ cart Item, grocery Item price-&gt;Price, 
    Tot-&gt;(group sum Price),
    &lt; Tot 30
</code></pre></div></div>
<p>and get the best combination of items, one that <em>maximizes</em> product ratings and <em>minimizes</em> the total cost. View it as a kind of multi-objective Pareto optimization:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>\&gt; cart Item, grocery Item rating-&gt;Rating,
     group sum Rating
\&lt; cart Item, grocery Item price-&gt;Price, 
     group sum Price
</code></pre></div></div>
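<p>For readers more at home in mainstream languages, here is a rough Python analogue of the enumerate-filter-optimize pipeline above. The data is invented, the model is simplified to two item types, and this is not Kubrick’s actual semantics:</p>

```python
from itertools import product

# Hypothetical groceries: (name, type, price, rating, origin choices).
groceries = [
    ("orange", "fruit", 1.2, 5, ("italy", "spain")),
    ("apple", "fruit", 0.9, 3, ("france",)),
    ("brie", "cheese", 12.0, 5, ("france",)),
    ("gouda", "cheese", 8.0, 4, ("netherlands",)),
]

def carts():
    """Enumerate carts with one item per type, expanding origin choices."""
    fruits = [g for g in groceries if g[1] == "fruit"]
    cheeses = [g for g in groceries if g[1] == "cheese"]
    for f, c in product(fruits, cheeses):
        for f_origin, c_origin in product(f[4], c[4]):
            yield [(f[0], f[2], f[3], f_origin), (c[0], c[2], c[3], c_origin)]

def pareto(candidates):
    """Keep carts not dominated on (maximize rating sum, minimize price sum)."""
    scored = [(sum(i[2] for i in c), sum(i[1] for i in c), c) for c in candidates]
    return [c for r, p, c in scored
            if not any(r2 >= r and p2 <= p and (r2 > r or p2 < p)
                       for r2, p2, _ in scored)]

affordable = [c for c in carts() if sum(i[1] for i in c) < 30]  # budget filter
best = pareto(affordable)  # the multi-objective front
```

Kubrick’s point is that the enumeration, the grouping constraint, and the optimization objectives are written as declarations rather than as the explicit loops and dominance checks above.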

<h3 id="user-experience">User experience</h3>

<p>The prototype’s interface resembles a notebook environment, with cells for writing code, for AI prompting, and for visualizing results. While program generation can be delegated to external LRMs, program execution happens locally in the browser. This approach combines the divide-and-conquer of notebooks, the immediacy of spreadsheet applications, and assisted program generation.</p>

<figure class="responsive-figure">
  <img src="/assets/pull-down/console.png" alt="Kubrick Notebook Interface" />
  <figcaption>Kubrick Notebook Interface</figcaption>
</figure>

<h3 id="process-flow">Process flow</h3>

<p>When the user submits a prompt, the web application (Capriccio) delegates to a specialized module (Avro) that augments the prompt with Kubrick language reference documentation and forwards the enriched request to the AI service. The generated code is then executed by the web application, which displays the results in the notebook cell.</p>

<figure class="responsive-figure">
  <img src="/assets/pull-down/kubrick-sequence.drawio.png" alt="Kubrick Sequence Diagram" />
  <figcaption>Kubrick Sequence Diagram</figcaption>
</figure>
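<p>The prompt-augmentation step can be sketched in a few lines of Python. The reference text and names here are illustrative stand-ins, not Avro’s actual implementation:</p>

```python
# A trimmed, hypothetical excerpt of a language reference document.
LANGUAGE_REFERENCE = """\
Kubrick quick reference:
- named tuples: key->value pairs
- sequences: ordered collections
- choices: alternative values separated by ';'
"""

def augment_prompt(user_prompt, reference=LANGUAGE_REFERENCE):
    """Wrap the user's request with the language documentation so that a
    general-purpose model can target Kubrick's syntax."""
    return (
        "You generate Kubrick code.\n\n"
        + reference
        + "\nUser request: " + user_prompt
        + "\nAnswer with Kubrick code only."
    )

enriched = augment_prompt("list groceries cheaper than 2 euros")
```

The enriched request is what reaches the AI service; the web application only has to execute whatever Kubrick code comes back.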

<h1 id="compile-time-workflows">Compile-time Workflows</h1>

<p>The glamorous <em>“agentic AI”</em> trend puts LRMs on the main stage of complex programming, exploiting their ability to elaborate decisions based on intricate sets of input data. In the reasoning-and-acting (ReAct) paradigm, the LLM is called at each transition of a broader state machine that describes the entire process. Basically, the AI tool is adopted not for planning the entire workflow, where specialized modules could be engaged for specific tasks, but for deciding which action to take at each step of the process. It is a rather limiting approach that only raises the complexity of the overall solution.</p>

<p>Why not let AI tools generate the entire workflow at once? Attempts<sup id="fnref:agents" role="doc-noteref"><a href="#fn:agents" class="footnote" rel="footnote">9</a></sup> in this direction have demonstrated that the effectiveness of AI generators can be substantially improved. This achievement reinforces the thesis that a highly expressive and declarative language like the one presented in this project can improve the impact of agentic AI.</p>
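<p>The contrast between per-step decision-making and compile-time planning can be sketched in Python. Here <code class="language-plaintext highlighter-rouge">llm_decide</code> and <code class="language-plaintext highlighter-rouge">llm_plan</code> are hypothetical stand-ins for model calls, and <code class="language-plaintext highlighter-rouge">tools</code> maps action names to ordinary functions:</p>

```python
def react_loop(goal, tools, llm_decide, max_steps=10):
    """ReAct style: the model is consulted at every transition."""
    state = {"goal": goal, "observations": []}
    for _ in range(max_steps):
        action, arg = llm_decide(state)  # one model call per step
        if action == "finish":
            return arg
        state["observations"].append(tools[action](arg))

def compiled_workflow(goal, tools, llm_plan):
    """Compile-time style: the model emits the whole workflow once,
    then a symbolic runtime executes it without further model calls."""
    result = None
    for action, arg in llm_plan(goal):  # single model call
        result = tools[action](arg)
    return result
```

In the second style the plan is an inspectable artifact: it can be validated, optimized, and re-run deterministically, which is exactly what a declarative target language makes easier.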

<h3 id="next-steps-mcp-integration">Next Steps: MCP integration</h3>

<p>We can inform the generator about the available functions and their interfaces (following Kubrick’s language), run the GenAI as an orchestrator, and model the entire process at <em>compile time</em>. <br />
Available functions can be those imported from libraries, but also those exposed through the Model Context Protocol (MCP), a revisited protocol for service discovery.<br />
In this way, the individual components can be invoked by the symbolic runtime that executes the generated code, with more fine-grained control over its execution.</p>

<figure class="responsive-figure">
  <img src="/assets/pull-down/mcp.png" alt="MCP Integration" />
  <figcaption>MCP Integration</figcaption>
</figure>

<h1 id="lets-meet-in-programming-experience-workshop-2026">Let’s Meet in <em>Programming Experience Workshop 2026</em></h1>

<p>This year <code class="language-plaintext highlighter-rouge">&lt;Programming&gt; 2026</code> will take place in Munich, Germany, on March 16-20, 2026. I will present this work at the <em>Programming Experience Workshop 2026</em><sup id="fnref:px2026" role="doc-noteref"><a href="#fn:px2026" class="footnote" rel="footnote">10</a></sup>. If you are around, feel free to reach out for a chat!</p>

<figure class="responsive-figure">
  <a href="/kubrick/">
    <img src="/assets/pull-down/main.jpg" alt="Kubrick" />
  </a>
  <figcaption>Click for a short try of Kubrick (without LRMs). NB: still buggy, please be forgiving.</figcaption>
</figure>
<p><a href="https://gfrison.com">Giancarlo Frison</a></p>

<h2 id="references">References</h2>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:productivity" role="doc-endnote">
      <p>Anderson, J. R. (1988). The expert module. In M. C. Polson &amp; J. J. Richardson (Eds.), <em>Foundations of intelligent tutoring systems</em> (pp. 21-53). Psychology Press. https://www.sciencedirect.com/science/article/abs/pii/0010027788900315 <a href="#fnref:productivity" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:kubrick" role="doc-endnote">
      <p>Frison, G. (n.d.). <em>Kubrick</em> [Computer software]. GitHub. https://github.com/gfrison/kubrick <a href="#fnref:kubrick" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:perplexity" role="doc-endnote">
      <p>Li, Y., Zhang, Y., &amp; Wang, Z. (2024). Understanding the impact of code complexity on large language models. <em>arXiv preprint arXiv:2508.18547</em>. https://arxiv.org/abs/2508.18547 <a href="#fnref:perplexity" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:perplexity:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:plagiarism" role="doc-endnote">
      <p>McDanel, B., Bunch, L., &amp; Krishnamurthi, S. (2024). Pedagogical strategies for mitigating AI-assisted plagiarism in programming education. In <em>Proceedings of the 55th ACM Technical Symposium on Computer Science Education</em> (pp. 1-7). https://bradmcdanel.com/wp-content/uploads/24_SIGCSE_LLM.pdf <a href="#fnref:plagiarism" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:sustainability" role="doc-endnote">
      <p>Jakubovic, J., Kell, S., &amp; Rein, P. (2023). Self-sustainability: Systems that can modify themselves. <em>arXiv preprint arXiv:2302.10003</em>. https://arxiv.org/abs/2302.10003 <a href="#fnref:sustainability" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:orm" role="doc-endnote">
      <p>Neward, T. (2006). The Vietnam of computer science. <em>ODBMS Industry Watch</em>. https://www.odbms.org/wp-content/uploads/2013/11/031.01-Neward-The-Vietnam-of-Computer-Science-June-2006.pdf <a href="#fnref:orm" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:cognition" role="doc-endnote">
      <p>Ivanova, A. A., Srikant, S., Sueoka, Y., Kean, H. H., Dhamala, R., O’Reilly, U.-M., Bers, M. U., &amp; Fedorenko, E. (2020). Comprehension of computer code relies primarily on domain-general executive brain regions. <em>eLife</em>, <em>9</em>, Article e58906. https://doi.org/10.7554/eLife.58906 <a href="#fnref:cognition" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:reasoning" role="doc-endnote">
      <p>Zhang, Y., Li, M., &amp; Chen, X. (2024). Enhancing language model reasoning through recursive iteration. <em>arXiv preprint arXiv:2411.17708</em>. https://arxiv.org/abs/2411.17708 <a href="#fnref:reasoning" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:agents" role="doc-endnote">
      <p>Wang, L., Ma, C., Feng, X., Zhang, Z., Yang, H., Zhang, J., Chen, Z., Tang, J., Chen, X., Lin, Y., Zhao, W. X., Wei, Z., &amp; Wen, J.-R. (2024). A survey on large language model based autonomous agents. <em>arXiv preprint arXiv:2402.01030</em>. https://arxiv.org/abs/2402.01030 <a href="#fnref:agents" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:px2026" role="doc-endnote">
      <p>Programming Experience Workshop 2026. (2026). In <em>Proceedings of the 2026 ACM SIGPLAN International Conference on Programming</em>. https://2026.programming-conference.org/home/px-2026 <a href="#fnref:px2026" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="hci" /><category term="cognitive system" /><category term="program-synthesis" /><category term="programming" /><category term="neural-symbolic" /><category term="llm" /><category term="low-code" /><category term="px26" /><summary type="html"><![CDATA[AI tools and programmers struggle with complexity given by the programming ecosystem itself. Kubrick is a declarative language designed to reduce accidental complexity through logic programming, functional paradigms, and relation algebra. Lower the friction, and generators work better. Make easy things easily while keeping expressiveness intact.]]></summary></entry><entry><title type="html">Adaptive Programming Systems for Humans and AI</title><link href="https://gfrison.com/2025/adaptive-programming-systems" rel="alternate" type="text/html" title="Adaptive Programming Systems for Humans and AI" /><published>2025-09-15T00:00:00+02:00</published><updated>2025-09-22T00:00:00+02:00</updated><id>https://gfrison.com/2025/adaptive-programming-systems</id><content type="html" xml:base="https://gfrison.com/2025/adaptive-programming-systems"><![CDATA[<p>In the BBC series “How Buildings Learn”, Stewart Brand discusses how buildings evolve over time and, in particular, what distinguishes good buildings from bad ones. One way to answer that is to look for commonalities among those still standing centuries after their construction. Excluding natural disasters like earthquakes, poor quality of materials and construction techniques, or the possible decline of the locality, the longevity of buildings depends on the satisfaction of their inhabitants in maintaining them. How can a static structure satisfy generations of owners? We might be tempted to think that the initial purpose of a building should not vary much over time due to its immutability. That would be correct, but only if we do not include the human element in the equation. 
Unlike buildings, people move and engage in all sorts of activities. The dynamism is brought by the users, and as expected, it changes over time. It turns out the longevity of a facility is determined by <strong>how easily it can meet new demands</strong>, and those demands are dictated by the kinds of interactions people have with the building during their activities.</p>

<p>The analogy can be applied to software applications. Comparing the average lifespan of a building with that of a software application, the impact of adaptability is even more evident. Applications are disposable and prone to continuous updates, and cases of legacy systems remaining untouched after decades are rare and anecdotal. Software is built to let users achieve specific goals. As building inhabitants change their uses and, consequently, their purposes and <em>mediators</em>, so too are programs clearly subject to evolution.</p>

<p class="notice--primary">Mediators: points of interaction between agents (users, developers, and AI generators) and the system. They are affordances such as APIs, design patterns, and frameworks, but also GUIs, command lines, etc.</p>

<p>While users interact with the application via its available mediators, programmers leverage a different kind, offered by the <em>programming system</em>, to crystallize the output of their work into a runnable application.</p>

<p class="notice--primary">Programming system: Integrated and complete set of tools sufficient for creating, modifying and performing programs. It encompasses programming languages, libraries, frameworks, development environments, and other tools that facilitate the software development process.</p>

<p>The more adaptive a programming system is, the easier it is to apply changes - and at a higher rate - than with less adaptive ones. A flexible programming system is better for the output. If the application is appreciated by customers, change requests inevitably arise, and an important aspect of evaluating whether to build a new application from scratch or update the current one is to consider the potential adaptability to changes. Certain mediators are involved in changing existing software, especially those more inherently tied to the application’s components. We can refer to them as <em>substrates</em>.</p>

<p class="notice--primary">Substrates: the various underlying mechanisms or interfaces that allow one to interact with and modify the software.</p>

<p>Substrates are the layers that encapsulate different modules and environments. They could be related to data management, dominated by the database interface, for example. Mediators may also be volatile archetypes, like prescribed design patterns for software development. When a generic design is found for a recurrent problem pattern, the solution is actually an abstract template for lowering the application’s accidental complexity. That archetype can be seen as a substrate applied to the application for a common problem.</p>

<figure class="align-center" style="width: 70%">
  <img src="/assets/images/adaptive-programming-systems/complexity.png" alt="Programming Effort and Complexity" />
  <figcaption aria-hidden="true">How Programming Effort is related to Complexity</figcaption>
</figure>

<p class="notice--primary">Accidental vs Essential complexity: if you have ever faced a programming challenge like those given during interviews, you may have experienced the frustration of not being able to solve it. That feeling is how the essential complexity manifests itself. On the other hand, when a programmer must update an intricate legacy system without documentation, desperation can easily take over. That sorrow relates to the accidental complexity, the kind of trouble <em>introduced</em> into the ecosystem. The essential is not negotiable, but the accidental must be kept at the lowest level.</p>

<h1 id="let-final-users-be-programmers">Let final users be programmers</h1>

<p>Mediators vary depending on the role of the agent. Those accessible to consumers may differ from those engaged by producers, as they are indeed very different. Programmers leverage mediators from the programming system, while end users interact with the application through exposed substrates, like GUIs. This separation is usually irreconcilable. Users may not expect to modify the program they are using. In fact, they have almost no chance to change it by <em>themselves</em>. More adaptability also means <em>open authorship</em>, a principle that empowers different roles to modify the software from lower-level substrates than those reserved for using it.</p>

<h2 id="real-cases-of-open-authorship">Real cases of open authorship</h2>

<p>Are there any live examples of open authorship out there? If you are reading this article, you are probably doing so through a browser. You can open the menu, access “developer tools,” and inspect the page. If you have a bit of HTML knowledge, you can modify the page directly. Pay attention to the page itself, because changes are instantly reflected in the rendered view. The HTML substrate is accessible and modifiable, and the feedback is immediate. A browser lets the consumer be a producer to some extent at the same time. Spreadsheets are another example of open authorship. The grid is visually accessible, and the engine works at the cell level. Users can modify the data and the formulas, and results are promptly reported. Moreover, the formula substrate is hierarchical, as formulas can be encoded at different levels of capability, from simple domain-specific languages to more complex languages like <code class="language-plaintext highlighter-rouge">VBScript</code> or <code class="language-plaintext highlighter-rouge">Python</code>.</p>

<h1 id="whats-programming">What’s programming?</h1>

<p>Empowering users to modify software without the help of professional developers is a way to make it more resilient to the changes that occur over time and to extend the application’s longevity. As mentioned before, this can’t be achieved by eliminating the essential complexity of the problem. The user - whether programmer or <em>empowered</em> user - necessarily needs to understand the problem first and then implement a solution. As Einstein said, <em>“95% of the time is spent understanding the problem; the little that remains goes to finding the solution”</em>; the essential complexity is the critical part that programming must address first.</p>

<p>For human agents, <strong>programming is a logical task, not a language-based one</strong>. The brain is not just parsing text when interpreting a program. Rather, it is actively simulating the program’s behavior, following its logical sense and interpreting the effects. It’s quite a different cognitive process than following a text or a flow of thoughts. This may be why a fully committed programmer may exhibit contrasting language attitudes - as vague stereotypes circulating around <em>nerds</em>  may confirm.</p>

<p>Evidence of the logical nature of programming can be seen in the tools typically offered by programming systems. Debuggers, code highlighting, code formatting, and other visualization tools aim to keep the memory span from being exceeded and to focus the user’s attention where it really matters. These tools are designed to lower the cognitive load of the mental module charged with the programming task.</p>

<h2 id="what-about-ai-agents-is-their-path-toward-code-generation-similar-to-human-programming">What about AI agents? Is their path toward code generation similar to human programming?</h2>

<p>Large language/reasoning models (LLRMs) show excellent results in many different tasks, demonstrating a general intelligence, but with uneven performance. To put it simply, LLRMs are essentially next-token predictors, trained to provide a probability distribution over what comes next in a given sequence of symbols. I’m afraid the answer to the previous question is clearly negative: LLRMs are not even close, in abstract terms, to the way humans program. This is not, per se, an insurmountable problem; it just helps to understand why some problems may occur when using those agents to generate code. Recurrent complaints about LLRMs can be summarized as follows:</p>

<ol>
  <li><strong>Struggles with mutable states and side effects</strong>. Imperative programming languages are more prone to induce errors in AI agents due to the increasing burden of tracking the state of a program while it operates.</li>
  <li><strong>Shallow code understanding</strong>. LLRMs focus on the syntax and semantics of parts of code (like variable names) and less on the overall logic of the program. For example, obfuscated code causes AI agents to fail even on simple problems.</li>
  <li><strong>Problems with multi-step reasoning</strong>. This is a generalization of the first point.</li>
  <li><strong>Generates complicated code</strong>. LLRMs tend to produce convoluted code, increasing rather than lowering complexity. I guess this problem is due to the already high accidental complexity of the ecosystem, forcing automatons to fill in the boilerplate with even more glue code.</li>
  <li><strong>Fails to meet requirements</strong>. Generators do not implement the full set of requirements, and keep omitting parts of them despite repeated requests.</li>
  <li><strong>Despite different prompts, it generates the same code</strong>.</li>
  <li><strong>Despite the same prompt, it generates different code</strong>.</li>
</ol>

<h1 id="all-in-one-development-environment">All-in-one Development Environment</h1>

<p>Some difficulties are <strong>remarkably similar</strong> to those encountered by human developers. Points (1), (3), and (4) are related to the inherited ecosystem and its idiosyncrasies. When poorly designed substrates are composed, the burden of keeping them aligned rises steeply. Misaligned substrates are the source of the <em>impedance mismatch</em> that necessitates further sub-problem elaboration, proportional to the amount of glue code involved. This affects automatons no less than humans.</p>

<p class="notice--primary">Impedance mismatch: the difficulty that arises when different systems with incompatible data models need to interface with each other. For example, the object-oriented data model of Java clashes with the relational model and the SQL interface used by databases.</p>

<h1 id="unified-mode-of-programming">Unified Mode of Programming</h1>

<p>To mitigate the problem, the goal is to build a platform using a minimum number of technologies, glued together in a programming system that allows the definition of a <strong>multitude of paradigms by adopting only a single way of expression</strong>: a core language sufficient to formalize data, computation, and external service access. This universal language should coherently allow many problem representations, minimizing the accidental complexity that derives from using non-unified integrated development systems.</p>

<p>I began this post by stressing the importance of adaptability for software success, and I can’t end it without mentioning an aspect of the low-code programming paradigm. In particular, if we want to lower the cognitive barriers and expand the pool of agents who can be actively involved in changing the tools they use or create, then those languages should be elastic enough to allow new versions of themselves specific to a domain (domain-specific languages, or DSLs) while keeping unnecessary code away. A common feature of successful low-code languages is the capability to be changed by agents solely by using the language itself. This is the principle of <em>self-sustainability</em>.</p>

<p class="notice--primary">Self-sustainability refers to the extent to which a system’s behavior can be changed from within itself, without having to <em>step outside</em> to a lower implementation level. A programming system that embraces self-sustainability allows its inner workings to be accessible from the user level, usually through macros, which are snippets of code that generate other code.</p>
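<p>A minimal sketch of self-sustainability, written in Python purely for illustration (the interpreter, the expression format, and the <code class="language-plaintext highlighter-rouge">square</code> macro are all invented here, not part of any system discussed in this post): a user-level macro rewrites expression trees into existing primitives, so the language grows from within itself.</p>

```python
# A toy self-sustaining interpreter: user-level "macros" rewrite
# expression trees into existing primitives, so new constructs are
# added from within the language, without touching the interpreter.
def evaluate(expr, env, macros):
    """Evaluate a tiny Lisp-like expression tree."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]                    # variable lookup
    head, *args = expr
    if head in macros:                      # macro: rewrite, then re-evaluate
        return evaluate(macros[head](*args), env, macros)
    if head == "+":
        return sum(evaluate(a, env, macros) for a in args)
    if head == "*":
        out = 1
        for a in args:
            out *= evaluate(a, env, macros)
        return out
    raise ValueError(f"unknown operator {head!r}")

# A user-level macro defining `square` in terms of existing primitives:
macros = {"square": lambda x: ["*", x, x]}
print(evaluate(["+", ["square", "n"], 1], {"n": 4}, macros))  # 17
```

<p>The interpreter never changes; the new construct is added purely by writing more expressions in the same substrate.</p>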

<p>There is no conclusion to this post; it is just one part of the journey towards understanding what programming is and how to match humans and AI in a common adaptive ecosystem. <a href="https://gfrison.com">Giancarlo Frison</a></p>

<ul>
  <li>Basman, A., Tchernavskij, P., Bates, S., &amp; Beaudouin-Lafon, M. (2018, April). An anatomy of interaction: co-occurrences and entanglements. In Programming’18 Companion - Conference Companion of the 2nd International Conference on Art, Science, and Engineering of Programming, Nice, France (pp. 188-196). ACM. https://doi.org/10.1145/3191697.3214328</li>
  <li>Clark, C., &amp; Basman, A. (2017). Tracing a Paradigm for Externalization: Avatars and the GPII Nexus. In Companion to the First International Conference on the Art, Science and Engineering of Programming (Programming ’17) (Article 31, 5 pages). ACM. https://doi.org/10.1145/3079368.3079410</li>
  <li>Jakubovic, J., Edwards, J., &amp; Petricek, T. (2023). Technical Dimensions of Programming Systems. The Art, Science, and Engineering of Programming, 7(3), 13. https://doi.org/10.22152/programming-journal.org/2023/7/13</li>
  <li>Petricek, T. (2022, April 28). No-code, no thought? Substrates for simple programming for all.</li>
</ul>]]></content><author><name>Giancarlo Frison</name></author><category term="hci" /><category term="program-synthesis" /><category term="programming" /><category term="neural-symbolic" /><category term="llm" /><category term="low-code" /><summary type="html"><![CDATA[How can programming systems adapt to evolving user and AI needs? Drawing parallels with architecture, I emphasize the importance of flexibility, open authorship, and user empowerment, and discuss the challenges faced by both humans and AI in program synthesis]]></summary></entry><entry><title type="html">Hybrid AI for Generating Programs: a Survey</title><link href="https://gfrison.com/2025/hybrid-ai-for-generating-programs" rel="alternate" type="text/html" title="Hybrid AI for Generating Programs: a Survey" /><published>2025-05-03T00:00:00+02:00</published><updated>2025-05-03T00:00:00+02:00</updated><id>https://gfrison.com/2025/hybrid-ai-for-generating-programs</id><content type="html" xml:base="https://gfrison.com/2025/hybrid-ai-for-generating-programs"><![CDATA[<p><em>Computer programming is a specialized activity that requires long training and experience to reach professional levels of productivity, precision and integration. AI practitioners have long aimed to create software tools that facilitate the work of programmers. The branch of AI dedicated to automatically generating programs from examples or some sort of specification is called program synthesis. In this dissertation, I’ll explore different methods of combining symbolic AI and neural networks (like large language models) to automatically create programs. The posed question is: <strong>How can AI methods be integrated to help synthesize programs for a wide range of applications?</strong></em></p>

<figure style="width: 400px" class="align-center">
  <img src="/assets/images/hybrid-ai-generating-programs-media/poster.jpg" />
</figure>

<p>Hybrid AI brings together two very different approaches: symbolic AI, which works like traditional programming with rules and logic, and connectionist AI, which relies on neural networks, with large language models (LLMs) as the most advanced example. In this dissertation, I review some literature that tries to combine these methods to leverage their strengths for generating programs. I focus on the most interesting and useful papers, setting aside those that were less clear or relevant. While this overview may simplify some aspects or overlook certain details, my aim is to clarify the key ideas and highlight what has been achieved so far in making these two approaches work together.</p>

<p>Program synthesis (PS) is the automatic process of generating programs that accomplish specified objectives. A program consists of a set of instructions written in a formal language that a symbolic engine can interpret and execute. Because of that, programs must be exactly right: any deviation may lead to incorrect results or even a program that cannot run at all. The central challenge is the transformation of the input - which often looks very different from the final program - into working code. The input may take different forms, as outlined below.</p>

<p class="notice--primary">You can also <a href="/assets/hybrid-ai-generating-programs.pdf">Download the PDF version</a> of this dissertation</p>

<h3 id="input-as-formal-specification">Input as formal specification</h3>

<p>The requirements might be written in a formal language, like test cases in the same language as the generated code, or in an even more abstract logical form. This is typical of the SyGuS<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> challenge, where specifications are runnable code snippets.</p>

<p>An example of PS specification states the constraints of the new function <code class="language-plaintext highlighter-rouge">max2</code>, its signature and the tests for correctness:</p>

<div class="language-lisp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">;; The background theory is linear integer arithmetic</span>
<span class="p">(</span><span class="nv">set-logic</span> <span class="nv">LIA</span><span class="p">)</span>
<span class="c1">;; Name and signature of the function to be synthesized</span>
<span class="p">(</span><span class="nv">synth-fun</span> <span class="nv">max2</span> <span class="p">((</span><span class="nv">x</span> <span class="nv">Int</span><span class="p">)</span> <span class="p">(</span><span class="nv">y</span> <span class="nv">Int</span><span class="p">))</span> <span class="nv">Int</span>    
    <span class="c1">;; Declare the non-terminals that would be used in the grammar</span>
    <span class="p">((</span><span class="nv">I</span> <span class="nv">Int</span><span class="p">)</span> <span class="p">(</span><span class="nv">B</span> <span class="nv">Bool</span><span class="p">))</span>
    <span class="c1">;; Define the grammar for allowed implementations of max2</span>
    <span class="p">((</span><span class="nv">I</span> <span class="nv">Int</span> <span class="p">(</span><span class="nv">x</span> <span class="nv">y</span> <span class="mi">0</span> <span class="mi">1</span>
             <span class="p">(</span><span class="nb">+</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">)</span> <span class="p">(</span><span class="nb">-</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">)</span>
             <span class="p">(</span><span class="nv">ite</span> <span class="nv">B</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">)))</span>
     <span class="p">(</span><span class="nv">B</span> <span class="nv">Bool</span> <span class="p">((</span><span class="nb">and</span> <span class="nv">B</span> <span class="nv">B</span><span class="p">)</span> <span class="p">(</span><span class="nb">or</span> <span class="nv">B</span> <span class="nv">B</span><span class="p">)</span> <span class="p">(</span><span class="nb">not</span> <span class="nv">B</span><span class="p">)</span>
              <span class="p">(</span><span class="nb">=</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">)</span> <span class="p">(</span><span class="nb">&lt;=</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">)</span> <span class="p">(</span><span class="nb">&gt;=</span> <span class="nv">I</span> <span class="nv">I</span><span class="p">))))</span>
<span class="p">)</span>
<span class="p">(</span><span class="nv">declare-var</span> <span class="nv">x</span> <span class="nv">Int</span><span class="p">)</span>
<span class="p">(</span><span class="nv">declare-var</span> <span class="nv">y</span> <span class="nv">Int</span><span class="p">)</span>
<span class="c1">;; Define the semantic constraints on the function</span>
<span class="p">(</span><span class="nv">constraint</span> <span class="p">(</span><span class="nb">&gt;=</span> <span class="p">(</span><span class="nv">max2</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nv">constraint</span> <span class="p">(</span><span class="nb">&gt;=</span> <span class="p">(</span><span class="nv">max2</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)</span> <span class="nv">y</span><span class="p">))</span>
<span class="p">(</span><span class="nv">constraint</span> <span class="p">(</span><span class="nb">or</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">x</span> <span class="p">(</span><span class="nv">max2</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">))</span> <span class="p">(</span><span class="nb">=</span> <span class="nv">y</span> <span class="p">(</span><span class="nv">max2</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">))))</span>
<span class="p">(</span><span class="nv">check-synth</span><span class="p">)</span>
</code></pre></div></div>

<h3 id="programming-by-example">Programming by example</h3>

<p>Rather than providing a strict set of requirements the program should adhere to, the generator is tuned on a few pairs of <code class="language-plaintext highlighter-rouge">input</code>$\rightarrow$<code class="language-plaintext highlighter-rouge">output</code> as training samples, and the learned pattern is then applied to an unseen, unpaired input.</p>

<p>One key detail: the <code class="language-plaintext highlighter-rouge">output</code> in these examples isn’t the actual code itself; it is rather the <em>result</em> of running the target program, and PS succeeds if the generated program produces the same output as the expected one. This type is named programming by example (PBE), and it is the approach adopted, for example, by ARC-AGI, an initiative for testing human-like intelligence in software agents<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
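<p>A toy illustration of this point (in Python; the example pairs and candidate programs are invented): two syntactically different programs count as the same PBE solution because they agree on every example pair.</p>

```python
# In PBE, success is judged on observed outputs, not on the code itself:
# syntactically different programs are equivalent solutions if they
# reproduce every input -> output example.
examples = [([3, 1], [1, 3]), ([2, 2, 0], [0, 2, 2])]

candidate_a = sorted                       # built-in sort
candidate_b = lambda xs: sorted(xs[::-1])  # reverse first, then sort

def satisfies(prog, examples):
    """True if the program reproduces every example pair."""
    return all(prog(i) == o for i, o in examples)

print(satisfies(candidate_a, examples), satisfies(candidate_b, examples))  # True True
```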

<figure>
<img src="/assets/images/hybrid-ai-generating-programs-media/2c96e91cce80e731cf60a7b993a72fa5b7129cef.png" alt="Example ARC-AGI task" />
<figcaption aria-hidden="true">Example ARC-AGI task</figcaption>
</figure>

<h3 id="input-in-plain-english">Input in plain English</h3>

<p>These inputs may take the form of high-level specifications written in natural language, such as a brief description of what the program should do. An example is the <code class="language-plaintext highlighter-rouge">ConCode</code><sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> dataset, where pairs of description and code allow the training of PS generators from English statements. Using natural language to define programs is indeed one of the hardest types of input to work with. The problem is that human language is ambiguous and fuzzy, while code needs to be exact, almost mathematical in its structure, as clearly stated by Dijkstra:</p>

<blockquote>
  <p><em>using natural language to specify a program is too imprecise for programs of any complexity<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup></em></p>
</blockquote>

<p>The lack of rich and complex language models has long been a barrier for this type of synthesis, but with LLMs it is possible to capture nuances that were previously out of reach.</p>

<h1 id="domain-specific-language">Domain specific language</h1>

<p>The generated program might be encoded in some general-purpose language that is designed - as the name suggests - to cover a very wide range of tasks. However, PS is usually tailored to narrow domains where the language’s expressiveness is not the biggest advantage. This is why a domain-specific language (DSL) is usually more appropriate.</p>

<p>A DSL brings some other advantages to the process. First of all, a DSL reduces the range of possible solutions a program generator has to search through: DSLs are restricted in scope and purpose, so the search space of possible programs is smaller. Another point in favor of DSLs is that they tend to be more readable for humans. They do not prescribe unnecessary boilerplate code; rather, they keep only the meaningful parts that actually affect what the program will do.</p>

<h1 id="symbolic-vs-statistical-methods">Symbolic vs Statistical methods</h1>

<p>Before diving into the methods used for PS, I would distinguish three approaches: fully symbolic, fully neural, and hybrid (symbolic + statistical).</p>

<h3 id="fully-symbolic">Fully symbolic</h3>

<p>Exclusively symbolic systems rely only on procedural algorithms for searching solutions and do not include any statistical method. Basically, they do not take into account any learned pattern to speed up the synthesis. Those methods usually traverse the entire search space, and when the problem can’t be narrowed to a small number of possibilities, the task may turn out to be very slow or even unfeasible.</p>

<h3 id="fully-neural">Fully neural</h3>

<p>On the other end, statistical methods usually count on the generalizations offered by neural networks: thanks to their universal function approximation, they tentatively map the input to the final program. Those methods have some drawbacks related to the massive amount of data they require to obtain convincing results<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>.</p>

<p>By their intrinsic nature, purely statistical methods lack the precision that is usually demanded for PS. This is a major issue since the output must run on a formal interpreter. To compensate, post-hoc filtering is necessary: it consists of generating multiple candidate programs and filtering them afterward. This usually degrades performance, defeating real-time use.</p>

<h3 id="hybrid-combine-the-best">Hybrid: combine the best</h3>

<p>An attentive observer may notice that the two paradigms above actually complement each other: while neural networks excel at approximations and fuzzy selections, symbolic methods perform best when nesting and composing primitive operations. So why not combine them by exploiting their strengths and mitigating their weaknesses? Hybrid methods attempt to do just that, but since they are fundamentally different, making them work together is not an easy task.</p>

<p>The integration of the two different approaches is usually called neuro-symbolic AI (NeSy), and the boundary where one system ends and the other begins varies significantly with the method adopted by the authors. Generally, an appropriate metaphor for NeSy could be the one that emphasizes the duality of reason/intuition, or the more glamorous dichotomy of Thinking Fast and Slow<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>. It basically describes how the (slow) human reasoning integrates with the (fast) intuitive mental module. Similarly, in PS the symbiosis occurs to facilitate the search for the target program: the symbolic clockwork scans the possibilities while the neural module suggests the <em>intuitions</em> to search more efficiently.</p>

<h1 id="enumerative-algorithms">Enumerative algorithms</h1>

<p>What mainly distinguishes symbolic from statistical methods is the presence of enumerative algorithms, which represent one of the workhorses of the entire program synthesis. Each time the PS generates a candidate program, it is validated against the DSL syntax and the specifications in order to ensure correctness<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup>.</p>

<p>If the verification fails and the candidate does not satisfy the specifications, the system has just found a counterexample – an input on which the program produces an incorrect output – that can be added to the history of failed trials to improve the search on subsequent iterations. This is the process named counter-example guided inductive synthesis (CEGIS), and it is applied not only in PBE but also in specification settings.</p>
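<p>The CEGIS loop can be sketched as follows (a Python toy targeting the <code class="language-plaintext highlighter-rouge">max2</code> specification shown earlier; a brute-force check over a small integer domain stands in for the SMT solver, and a hard-coded candidate list stands in for the enumerator):</p>

```python
# A toy CEGIS loop for max2: candidates are tried in order, each failed
# candidate contributes a counterexample, and the growing history is
# used to refute later candidates cheaply before full verification.
from itertools import product

CANDIDATES = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x+y", lambda x, y: x + y),
    ("ite(x>=y,x,y)", lambda x, y: x if x >= y else y),
]

def spec(f, x, y):
    """Semantic constraints of max2: >= both inputs, equal to one of them."""
    m = f(x, y)
    return m >= x and m >= y and (m == x or m == y)

def cegis(domain=range(-5, 6)):
    counterexamples = []                 # grows as candidates are refuted
    for name, f in CANDIDATES:
        if not all(spec(f, x, y) for x, y in counterexamples):
            continue                     # already refuted by the history
        cex = next(((x, y) for x, y in product(domain, repeat=2)
                    if not spec(f, x, y)), None)
        if cex is None:
            return name                  # verified on the whole domain
        counterexamples.append(cex)      # remember why this one failed
    return None

print(cegis())  # 'ite(x>=y,x,y)'
```

<p>A real system would synthesize the candidates from the grammar and delegate verification to a solver, but the verify/refute/retry cycle is the same.</p>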

<p>As previously mentioned, the fast and intuitive module is deployed to <em>guide</em> the search for candidates, but learning to search works best when it exploits existing search algorithms already proven useful for the problem domain of interest - for example, AlphaGo exploits Monte Carlo Tree Search, and DeepCoder also uses Satisfiability Modulo Theories (SMT) solvers. Which kinds of searches are usually employed? A distinction should be made among the different approaches I’ve seen in the surveyed methods.</p>

<h3 id="top-down-enumeration">Top-down enumeration</h3>

<p>The top-down paradigm shows how a high-level problem can be decomposed into smaller parts that are nested or connected together. In programming, top-down algorithms include depth-first search, its companion breadth-first search, and the more generic <a href="https://gfrison.com/2019/06/18/dynamic-programming">dynamic programming</a>. Those algorithms are recursive, since they unfold into branches that have a similar structure, but they suffer from a big problem.</p>

<p>Many derived branches ultimately prove irrelevant to the optimal solution. If we can infer which sub-problems are worthy of scrutiny, we can significantly reduce computation time. This is why we need more sophisticated ways than the ones mentioned above. The <code class="language-plaintext highlighter-rouge">A*</code> search combines backtracking search with heuristics that help to find the right path<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup>. Where do heuristics come from? Of course, from a learned probabilistic model. This is the intuition at work! With top-down enumeration, the model can be called just <em>before</em> the search and just <em>once</em>, since its output includes the elements to drive the search without further inquiries.</p>
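<p>A minimal sketch of such a guided top-down enumeration (Python; the grammar and the per-rule costs are invented, with the costs standing in for a learned model’s output): partial programs sit in a priority queue, the cheapest is expanded first, and low-cost rules are therefore explored before expensive ones.</p>

```python
# Best-first top-down enumeration: expand the leftmost nonterminal of
# the cheapest partial program; rule costs play the role of -log
# probabilities from a learned model.
import heapq
from itertools import count

RULES = {  # nonterminal -> [(cost, expansion)]
    "E": [(1.0, ["x"]), (1.0, ["1"]), (2.5, ["(", "E", "+", "E", ")"])],
}

def enumerate_programs(goal, max_pops=10_000):
    tie = count()                                  # tiebreaker for the heap
    frontier = [(0.0, next(tie), ["E"])]
    for _ in range(max_pops):
        if not frontier:
            return None
        cost, _, sent = heapq.heappop(frontier)
        nt = next((i for i, s in enumerate(sent) if s in RULES), None)
        if nt is None:                             # fully expanded program
            if "".join(sent) == goal:
                return "".join(sent), cost
            continue
        for rule_cost, expansion in RULES[sent[nt]]:
            child = sent[:nt] + expansion + sent[nt + 1:]
            heapq.heappush(frontier, (cost + rule_cost, next(tie), child))
    return None

print(enumerate_programs("(x+1)"))  # ('(x+1)', 4.5)
```

<p>Here the "goal test" is string equality for simplicity; in real PS it would be verification against the examples or the specification.</p>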

<p>Because search branches operate independently - a property called <em>Markovian</em> - top-down methods work well for supervised learning. Consider a target output: a program represented as an abstract syntax tree (AST) is essentially a hierarchy of operations that unfolds from top to bottom. If we can encode the AST in a neural network, its representation should inevitably reflect a tree structure, which we can use to discriminate which branches are worth examining.</p>

<h3 id="bottom-up-enumeration">Bottom-up enumeration</h3>

<p>In contrast, bottom-up searches do not offer the same Markovian characteristic. Bottom-up search builds programs from smaller components, and the <em>correctness</em> or <em>usefulness</em> of a chunk of a program might only become apparent when it is combined with other segments later in the process.</p>

<p>What might be more appealing in bottom-up settings is that the process of building programs closely mirrors the intuition of a human programmer, who writes small functions first and then combines them to get the desired solution. This differs from the top-down approach, where you start with the big picture and break it down. Instead, bottom-up focuses on solving smaller, manageable problems first and then assembling them into more complex solutions<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup>.</p>

<p>Because the search starts at the leaves of the AST and works its way up, every sub-program it generates is already executable. At any stage of the search, every sub-program has a concrete value attached to it, which helps to assess how well it combines with other code segments<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">10</a></sup>.</p>
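<p>A bottom-up sketch (Python; the two-operation DSL and the examples are invented): every entry in the bank is an executable sub-program keyed by its concrete values on the example inputs, so semantically duplicate programs are pruned for free.</p>

```python
# Bottom-up enumeration with observational equivalence: sub-programs
# are stored keyed by their values on the example inputs, and larger
# programs are built by combining entries already in the bank.
def bottom_up(examples, max_size=3):
    inputs = tuple(x for x, _ in examples)
    target = tuple(y for _, y in examples)
    bank = {}                                    # values-tuple -> expression
    for expr, vals in [("x", inputs), ("1", (1,) * len(inputs))]:
        bank[vals] = expr
    for _ in range(max_size):
        pairs = [(p, q) for p in list(bank.items()) for q in list(bank.items())]
        for (va, ea), (vb, eb) in pairs:
            for op, fn in (("+", lambda a, b: a + b), ("*", lambda a, b: a * b)):
                vals = tuple(fn(a, b) for a, b in zip(va, vb))
                expr = f"({ea}{op}{eb})"
                if vals == target:
                    return expr
                bank.setdefault(vals, expr)      # keep first (smallest) program
    return None

# Synthesize f(x) = 2x + 1 from two examples:
print(bottom_up([(3, 7), (10, 21)]))  # (x+(x+1))
```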

<p>So, how does the model contribute to the search? Let’s talk about some probability-based methods that help those enumerations.</p>

<h1 id="probabilistic-guided-search">Probabilistic guided search</h1>

<p>Most practitioners agree that heuristics are useful for finding the right solution, and these shortcuts are also based on observed patterns. In particular, it is evident that not all programs are equally likely - some patterns appear more often than others. There definitely is a bias in the way programs are written. For instance, it is very unlikely that a single program uses both <code class="language-plaintext highlighter-rouge">filter(&gt; x 0)</code> and <code class="language-plaintext highlighter-rouge">filter(&lt;= x 0)</code>, since one selects exactly the complement of the other.</p>

<p>Another suggestion for inferring which operations may be involved comes from an analysis of the training data labels (in PBE settings). Is there some kind of alignment in the output’s elements, for example a sorting? If so, then most likely there is a sort command in the target program<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">11</a></sup>.</p>

<h3 id="probabilistic-grammar-tree">Probabilistic grammar tree</h3>

<p>Earlier, I gave some reasons for using DSLs as target languages. One advantage is that DSLs come with a well-defined context-free grammar<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">12</a></sup> (CFG), which is a formal way to define how languages are structured. A CFG is defined by a set of:</p>

<ul>
  <li>terminal symbols - the atomic elements that can’t be derived from other elements.</li>
  <li>non-terminal symbols (variables) - placeholders that the engine expands further through the rules of production.</li>
  <li>rules of production - the core of the CFG, because they are where the generation can unfold the expressiveness of the language and cover myriads of use cases.</li>
</ul>

<p>The rules determine the language’s expressiveness, allowing it to generate countless use cases. Think of forming sentences in English: not every word sequence is valid, and a CFG strictly defines which combinations are allowed. This rigidity contrasts with neural models, which learn from examples and work non-deterministically rather than following hard rules.</p>

<p>A probabilistic context-free grammar (pCFG) is an extension of CFG where each rule is associated with a probability that reflects the chance of choosing that particular rule when expanding the program into a sequence of terminals. Those probabilities (or weights) are statistically derived from a corpus of programs linked to a specific task<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">13</a></sup>.</p>
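<p>A pCFG can be sketched in a few lines (Python; the grammar and the weights are invented here rather than learned from a corpus): each nonterminal is expanded by a rule sampled in proportion to its weight.</p>

```python
# Sampling programs from a toy pCFG: nonterminals expand recursively,
# with each production chosen in proportion to its learned weight.
import random

PCFG = {
    "E": [(0.5, ["x"]), (0.3, ["1"]), (0.2, ["(", "E", "+", "E", ")"])],
}

def sample(symbol="E", rng=random):
    if symbol not in PCFG:
        return symbol                              # terminal symbol
    weights, expansions = zip(*PCFG[symbol])
    expansion = rng.choices(expansions, weights=weights)[0]
    return "".join(sample(s, rng) for s in expansion)

random.seed(0)
print([sample() for _ in range(5)])
```

<p>In guided enumeration the same weights would rank expansions rather than sample them, so likely programs are visited first.</p>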

<h3 id="attribute-vector">Attribute vector</h3>

<p>Another way to learn heuristics from training data is the approach used in <code class="language-plaintext highlighter-rouge">DeepCoder</code><sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">14</a></sup>, where the system uses a vector to represent the probability that certain programming features are present in the target program. The encoded features depend, of course, on the DSL adopted for the task. For example, the vector might encode features such as:</p>

<ul>
  <li>unary math operations - <em>like <code class="language-plaintext highlighter-rouge">abs</code>, <code class="language-plaintext highlighter-rouge">sqrt</code></em>.</li>
  <li>binary arithmetic - <em>like <code class="language-plaintext highlighter-rouge">+</code>, <code class="language-plaintext highlighter-rouge">-</code></em>.</li>
  <li>sequence helper methods - <em>like <code class="language-plaintext highlighter-rouge">head</code>, <code class="language-plaintext highlighter-rouge">tail</code>, <code class="language-plaintext highlighter-rouge">take</code>, <code class="language-plaintext highlighter-rouge">drop</code></em>.</li>
  <li>set functionalities - <em>like <code class="language-plaintext highlighter-rouge">intersection</code>, <code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">diff</code></em>.</li>
  <li>filters - <em>like <code class="language-plaintext highlighter-rouge">&gt;</code>, <code class="language-plaintext highlighter-rouge">&lt;=</code>, <code class="language-plaintext highlighter-rouge">==</code></em>.</li>
  <li>tailored compositions of functions - <em>like <code class="language-plaintext highlighter-rouge">mergesort</code></em>.</li>
</ul>

<p>The (probabilistic) attribute vector is then used to bias the enumerator’s decisions when it searches for viable solutions.</p>
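<p>A sketch of how such a vector can prune the search (Python; the DSL, the hand-written predictor standing in for the neural model, and the threshold are all invented): only operations whose predicted probability passes the threshold are enumerated, shrinking the search space before enumeration even starts.</p>

```python
# Attribute-vector guidance in the DeepCoder spirit: a (mocked) model
# maps the I/O examples to a probability per DSL operation, and the
# enumerator only combines operations above a threshold.
from itertools import product

DSL = {
    "head":   lambda xs: xs[:1],
    "tail":   lambda xs: xs[1:],
    "sort":   sorted,
    "double": lambda xs: [2 * x for x in xs],
}

def predict_attributes(examples):
    """Stand-in for the neural predictor: a hand-written heuristic."""
    probs = dict.fromkeys(DSL, 0.1)
    for inp, out in examples:
        if sorted(inp) == sorted(x // 2 for x in out if x % 2 == 0):
            probs["double"] = 0.9        # outputs look like doubled inputs
        if out == sorted(out):
            probs["sort"] = 0.8          # outputs are in ascending order
    return probs

def guided_search(examples, threshold=0.5, max_depth=2):
    ops = [n for n, p in predict_attributes(examples).items() if p >= threshold]
    for depth in range(1, max_depth + 1):
        for names in product(ops, repeat=depth):
            def run(xs, names=names):
                for n in names:
                    xs = DSL[n](xs)
                return xs
            if all(run(i) == o for i, o in examples):
                return names
    return None

print(guided_search([([3, 1, 2], [2, 4, 6])]))  # ('sort', 'double')
```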

<h3 id="argument-selector">Argument selector</h3>

<p>The two types of statistical models above fit well with top-down search, since they capture the overall structure of the target program. In contrast, systems such as <code class="language-plaintext highlighter-rouge">CrossBeam</code><sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">15</a></sup> take a bottom-up approach. Instead of reasoning from the top level, the model decides how to combine previously explored operations to build new program components.</p>

<p>As we have seen, not only the programs are kept in the history, but also their execution results. In the following iterations those program/value pairs are used to generate more complex candidates, until the desired solution is found. In this way, promising combinations are prioritized rather than exhaustively enumerating all possibilities.</p>

<h3 id="branch-selector">Branch selector</h3>

<p>In neural guided deductive search (NGDS)<sup id="fnref:16" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">16</a></sup>, the model is intended to rank potential extensions of a partial program. It takes the input-output examples and the current candidate program, then assigns weights to the different branches the search algorithm could explore.</p>

<p>At the beginning, the candidate program is of course empty, and from there starts the <em>divide and conquer</em> top-down approach, building the solution step-by-step from the root of the AST. When the enumerator reaches a branch, the model suggests the most promising path to follow - guiding the search toward a valid solution.</p>

<h3 id="auxiliary-construction">Auxiliary construction</h3>

<p>What if the neural model doesn’t just help to speed up the search by providing proper heuristics, but rather <em>generates</em> new supporting information, helping the symbolic engine to discover a better path to the correct program? This is what has been proposed in AlphaGeometry<sup id="fnref:17" role="doc-noteref"><a href="#fn:17" class="footnote" rel="footnote">17</a></sup> with the purpose of creating proofs to verify geometric theorems<sup id="fnref:18" role="doc-noteref"><a href="#fn:18" class="footnote" rel="footnote">18</a></sup>.</p>

<figure>
<img src="/assets/images/hybrid-ai-generating-programs-media/8b9d002a5ef7ed9a9d71945116da31e23b9e1a30.png" alt="Overview of our neuro-symbolic AlphaGeometry and how the auxiliary construction helps to solve a simple problem" />
<figcaption aria-hidden="true">Overview of our neuro-symbolic AlphaGeometry and how the auxiliary construction helps to solve a simple problem</figcaption>
</figure>

<h1 id="llms-in-the-loop">LLMs in the loop</h1>

<p>Unlike specialized statistical models, LLMs can bring broader cognitive abilities to PS. Though they were designed to generate the next word in a sequence, they have shown surprising reasoning capabilities, the kind that can be valuable for generating code.</p>

<p>Whether LLMs genuinely reason or merely emulate it by retrieving training data is still controversial<sup id="fnref:19" role="doc-noteref"><a href="#fn:19" class="footnote" rel="footnote">19</a></sup>. But for practical purposes, I’m more interested in observing what they can actually do, and integrating them into PS poses new challenges - as does any emerging technology.</p>

<h3 id="lack-of-consistency">Lack of consistency</h3>

<p>A persistent challenge is the lack of consistency and generalization in reasoning behavior. The inconsistency arises when LLMs provide different answers to semantically equivalent inputs. For instance, simply altering the prompt’s tone - or even adding threats to the request - can significantly steer the LLM’s output. This suggests that LLMs probably don’t have a stable, logical framework guiding their answers.</p>

<p>Crafting effective prompts remains an art (perhaps one of the few still exclusive to humans?) and this unpredictability adds another layer of uncertainty to PS. Still, this kind of problem is well known to practitioners, and when a problem is familiar, there are usually ways to circumvent it, such as those surveyed here.</p>

<p>When LLMs generate programs - and they usually return many wrong solutions - the correct solution is most often in the proximity of the wrong ones. By searching in the neighborhood of the invalid proposals, we may guide the search to find a solution faster, for example by compiling a pCFG as described in a previous section.</p>
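A minimal sketch of that compilation step: count which DSL operators appear across the (possibly invalid) LLM candidates and turn the counts into smoothed production probabilities. The counting-by-regex and the Laplace smoothing are illustrative simplifications, not the exact scheme of any specific paper.

```python
# Sketch: estimate pCFG production weights from LLM-generated candidate
# programs. Operators frequent in the near-miss candidates get higher
# probability, steering the enumerator toward that neighborhood.

from collections import Counter
import re

def pcfg_weights(candidates, operators):
    counts = Counter()
    for prog in candidates:
        for op in operators:
            counts[op] += len(re.findall(rf"\b{op}\b", prog))
    total = sum(counts.values()) or 1
    # Laplace smoothing keeps a small nonzero probability for unseen operators.
    return {op: (counts[op] + 1) / (total + len(operators)) for op in operators}
```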

<h3 id="context-is-the-king">Context is king</h3>

<p>LLMs, being based on the transformer architecture, possess long-range memory that enables them to track distant elements in the request. This capability, joined with vast background knowledge that extends beyond the narrow scope of the PS request, is indeed helpful in determining proper variations of the generated program.</p>

<p>While memory capabilities are present in much simpler neural architectures, in LLMs they surely reach undisputed highs. This aspect also affects the latent space that governs the network’s capabilities: LLM embeddings incorporate much more information than those of older neural architectures. RAG (retrieval augmented generation) is a method that encodes vectors to enable similarity search over very complex structures. Though similarity is indeed used for fuzzy searches, its importance in PS remains marginal due to the tighter requirements of program generation. Program structures are closer to graph representations, which can hardly be generated by similarity alone.</p>

<p>The leverage of high contextualization is well exploited in top-down search methods, where a hierarchical view of the final goal helps greatly in splitting high-level problems into tiny ones. On the other hand, this reliance on context can become a drawback when using those powerful models for bottom-up searches. Working in the opposite direction, from basic functions up to complex solutions without the big picture, is much harder.</p>

<h1 id="hybrid-methods">Hybrid methods</h1>

<h3 id="alpha-geometry">Alpha Geometry</h3>

<p>As previously mentioned, the <em>intuitive hint</em> comes from a neural model based on the transformer architecture. It is initially pre-trained on a large amount of synthetically generated cases of pure geometrical deduction, to create a grounding latent space for geometrical knowledge. The model is then fine-tuned on generating auxiliary statements to be added to the initial geometrical proposition<sup id="fnref:20" role="doc-noteref"><a href="#fn:20" class="footnote" rel="footnote">20</a></sup>.</p>

<h3 id="deepcoder">DeepCoder</h3>

<p>Or: how to learn to write programs with a NeSy approach. It is based on PBE and employs a neural network (not an LLM) to encode an attribute vector that lists the features the final program should include. The search starts by including the most promising features, then tests and prunes inapplicable programs until it finds a solution that satisfies the provided examples<sup id="fnref:21" role="doc-noteref"><a href="#fn:21" class="footnote" rel="footnote">21</a></sup>.</p>
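A toy rendering of the attribute-vector idea, where a hand-written predictor stands in for DeepCoder's neural encoder (all names and scores here are hypothetical):

```python
# Sketch: a stand-in predictor scores how likely each DSL operator is to
# appear in the target program; the search then enumerates only over the
# top-k operators instead of the whole DSL, shrinking the search space.

def top_attributes(predict, examples, operators, k=2):
    scores = predict(examples)  # plays the neural net: examples -> {op: score}
    return sorted(operators, key=lambda op: -scores.get(op, 0.0))[:k]
```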

<h3 id="illm-synth">iLLM-synth</h3>

<p>This method involves prompting the LLM to provide helper functions or syntactic suggestions based on the partially constructed program, any counterexamples encountered (CEGIS), and the provided problem specification (SyGuS). The LLM’s feedback can even augment the grammar and update rule weights dynamically as the search progresses. A pCFG is created from the LLM’s suggestions, and the symbolic searcher applies an <code class="language-plaintext highlighter-rouge">A*</code> algorithm based on the heuristics computed from the pCFG. The search stops when a solution is found or a maximum cost is reached - that is, when a maximum depth in the search tree is hit, the time budget is spent, or the diminishing returns of searching no longer justify further exploration<sup id="fnref:22" role="doc-noteref"><a href="#fn:22" class="footnote" rel="footnote">22</a></sup>.</p>

<figure>
<img src="/assets/images/hybrid-ai-generating-programs-media/97ae98f64664e5fbe8216db398c72737ac827c6c.png" alt="Overview of iLLM-synth" />
<figcaption aria-hidden="true">Overview of iLLM-synth</figcaption>
</figure>
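The cost-bounded search can be pictured as a best-first loop where each production's cost is the negative log of its pCFG probability, so likely programs come out cheap. The sketch below is a uniform-cost simplification of the A* idea; `expand`, `is_goal` and the budget handling are hypothetical stand-ins, not iLLM-synth's actual interfaces.

```python
# Sketch of pCFG-guided, cost-bounded search: cost of applying an operator
# is -log(probability), so frequent operators in the pCFG are cheap.

import heapq
from math import log

def guided_search(weights, expand, is_goal, start, max_cost=10.0):
    # weights: pCFG probabilities per operator
    # expand: partial program -> list of (operator, next partial program)
    frontier = [(0.0, start)]
    seen = set()
    while frontier:
        cost, prog = heapq.heappop(frontier)
        if cost > max_cost:      # budget exhausted: give up
            return None
        if is_goal(prog):
            return prog
        if prog in seen:
            continue
        seen.add(prog)
        for op, nxt in expand(prog):
            heapq.heappush(frontier, (cost - log(weights.get(op, 1e-6)), nxt))
    return None
```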

<h3 id="neural-guided-deductive-search">Neural guided deductive search</h3>

<p>It is a PBE-oriented method where a DSL provides the vocabulary of operations that apply to different parts of the input. The method works by filtering the most likely useful operations (via statistical learning) and then chaining them together recursively, following a functional programming style<sup id="fnref:23" role="doc-noteref"><a href="#fn:23" class="footnote" rel="footnote">23</a></sup>.</p>

<p>Take for example the following problem: “Firstname Lastname” $\rightarrow$ “FL”:</p>

<ul>
  <li>The model’s inference breaks it down into smaller problems by splitting them with the <code class="language-plaintext highlighter-rouge">firstWord</code> and <code class="language-plaintext highlighter-rouge">lastWord</code> functions.</li>
  <li>Those two branches are then computed separately by recursively solving “Firstname” and “Lastname”.</li>
  <li>The model finds that applying <code class="language-plaintext highlighter-rouge">firstChar</code> sequentially to the previous operations might lead to interesting results, which the model will again attempt to <code class="language-plaintext highlighter-rouge">combine</code> together.</li>
</ul>

<p>If this sounds over-simplified, well, it is. Real-world cases involve more complexity, but this gives the basic idea<sup id="fnref:24" role="doc-noteref"><a href="#fn:24" class="footnote" rel="footnote">24</a></sup>.</p>
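Written out as plain functions, the decomposition above composes into a single program (the function names follow the toy DSL of the example; the implementations are my own illustrative guesses):

```python
# The toy NGDS decomposition, rendered as plain Python: split the input,
# solve each branch ("Firstname" -> "F", "Lastname" -> "L"), then combine.

def first_word(s):
    return s.split()[0]

def last_word(s):
    return s.split()[-1]

def first_char(s):
    return s[0]

def combine(a, b):
    return a + b

def synthesized(s):
    return combine(first_char(first_word(s)), first_char(last_word(s)))
```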

<h3 id="crossbeam">Crossbeam</h3>

<p>Differently from the previous approaches, the enumerative search follows bottom-up prescriptions. Starting from the raw input, it applies the most likely operations, and their results are reused in the following iterations. The search is guided by a neural model trained on program examples. While in NGDS the model gives indications on the AST, in Crossbeam only the operations’ outputs are considered for ranking nested operations. The model learns how to combine evaluated sub-programs in a bottom-up manner<sup id="fnref:25" role="doc-noteref"><a href="#fn:25" class="footnote" rel="footnote">25</a></sup>.</p>

<h3 id="hysynth">HySynth</h3>

<p>Differently from other methods, the LLM is prompted first to guess the program for a PBE problem. The candidate programs are used to compile a pCFG, which then guides the following bottom-up search.</p>

<figure>
<img src="/assets/images/hybrid-ai-generating-programs-media/0fb9a685b4cf3d9ec2406a9bb12817d1db2fd4c6.png" alt="Overview of HySynth" />
<figcaption aria-hidden="true">Overview of HySynth</figcaption>
</figure>

<p>The search method falls under dynamic programming, meaning it builds the program from its smaller parts step by step, assigning a computational cost to each new chunk of code. This cost reflects the resources needed to execute that part, as well as the search process itself, and it is used for filtering out unsuitable candidates. The DP-based search stores intermediate results to avoid redundant calculations, improving efficiency by skipping already explored sub-programs<sup id="fnref:26" role="doc-noteref"><a href="#fn:26" class="footnote" rel="footnote">26</a></sup>.</p>
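One way to picture the DP bank: programs are built level by level in order of total cost, and a table keyed by output value stores the cheapest program found for each intermediate result, so equivalent sub-programs are never recomputed. The operators and their costs below are made up for illustration, not HySynth's actual DSL.

```python
# Sketch of cost-banded dynamic programming for program search.
# bank: cost level -> {output value: cheapest program producing it}

def dp_search(inp, target, ops, op_cost, max_cost=6):
    bank = {1: {inp: "x"}}
    seen = {inp}                      # memoization over output values
    for cost in range(1, max_cost):
        for value, prog in bank.get(cost, {}).items():
            for name, fn in ops.items():
                out = fn(value)
                total = cost + op_cost.get(name, 1)
                if out == target:
                    return f"{name}({prog})", total
                if out not in seen and total < max_cost:
                    seen.add(out)     # skip already explored sub-programs
                    bank.setdefault(total, {})[out] = f"{name}({prog})"
    return None
```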

<h1 id="conclusion">Conclusion</h1>

<p>In this brief overview I explored how different methods for PS can be combined in flexible ways. Various approaches have been tested to find simple and effective ways to generate programs. Pure statistical methods, especially when based on LLMs, may provide simple solutions that do not require combinations of techniques, relying instead on the universality of the broad context they can leverage for producing code.</p>

<p>Usually, fully neural approaches are effective when problems are simple, when LLMs can compensate for the lack of explicit knowledge in the input, or when they can simply exploit massive amounts of publicly available programming code. When none of these conditions applies, LLMs are beneficial when paired with enumerative processes, guiding the search towards more plausible candidates. The intersection between symbolic and neural systems lies where probabilities may have a role in compiling heuristics. In particular, the flat semantics of a DSL can be more valuable for search when enriched with a pCFG. <em>author: <a href="https://gfrison.com">Giancarlo Frison</a></em></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>SyGuS-Org. (n.d.). <em>SyGuS language</em>. SyGuS. Retrieved from <a href="https://sygus-org.github.io/language/">https://sygus-org.github.io/language/</a> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Chollet, F., Knoop, M., Kamradt, G., &amp; Landers, B. (2024). ARC Prize 2024: Technical Report. arcprize.org <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>Soliman, A. S. (n.d.). <em>CodeXGLUE-CONCODE dataset</em>. Hugging Face. Retrieved from <a href="https://huggingface.co/datasets/AhmedSSoliman/CodeXGLUE-CONCODE">https://huggingface.co/datasets/AhmedSSoliman/CodeXGLUE-CONCODE</a> <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Dijkstra, E.W. <em>On the foolishness of “natural language programming”</em>; https://bit.ly/3V5ZP5 <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Moreover, in many cases those datasets are synthetically generated by the same models adopted for reversing them into programs. Unfortunately, it introduces the self-selection bias, the kind of problem where the training data is not actually a reliable sample of real-case scenarios <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p><strong>Kahneman, D. (2011).</strong> <em>Thinking, fast and slow</em>. Farrar, Straus and Giroux. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>This verification should not cause significant performance degradation, as the verifier is symbolic and typically the same component that will execute the generated artifact once deployed in the target system. Unless the verification involves costly data access, we can confidently assume it takes negligible resources. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>Li, Y., Parsert, J., &amp; Polgreen, E. (2024). Guiding enumerative program synthesis with large language models. Proc. ACM Program. Lang., 8(POPL). <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>bottom-up search is similar to mathematical factorization, which refers to the process of breaking down a number into smaller factors. Given a non-prime number $n$ then $n = a * b$ where $a,b$ are factors of $n$. Important note: it’s easy to multiply, but hard to factor. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Shi, K., Dai, H., Ellis, K., &amp; Sutton, C. (2022). CROSSBEAM: Learning to search in bottom-up program synthesis. arXiv preprint arXiv:2203.10452 <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:11" role="doc-endnote">
      <p>Balog, M., Gaunt, A.L., Brockschmidt, M., Nowozin, S., &amp; Tarlow, D. (2016). DeepCoder: Learning to Write Programs. <em>ArXiv, abs/1611.01989</em>. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12" role="doc-endnote">
      <p>The term “context-free” means that the rules for creating a valid sentence depend only on the individual symbols themselves, not on their surrounding elements. Imagine building a sentence with blocks; each block has a specific role regardless of where it sits in the final structure. <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13" role="doc-endnote">
      <p>Those programs might be given training samples but also generated programs as we will see specifically with LLMs adoption. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:14" role="doc-endnote">
      <p>Balog, M., Gaunt, A.L., Brockschmidt, M., Nowozin, S., &amp; Tarlow, D. (2016). DeepCoder: Learning to Write Programs. <em>ArXiv, abs/1611.01989</em>. <a href="#fnref:14" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:15" role="doc-endnote">
      <p>Shi, K., Dai, H., Ellis, K., &amp; Sutton, C. (2022). CROSSBEAM: Learning to search in bottom-up program synthesis. arXiv preprint arXiv:2203.10452 <a href="#fnref:15" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:16" role="doc-endnote">
      <p>Vijayakumar, A. K., Batra, D., Mohta, A., Jain, P., Polozov, O., &amp; Gulwani, S. (2018). Neural-guided deductive search for real-time program synthesis from examples. In Advances in Neural Information Processing Systems. Retrieved from https://arxiv.org/abs/1804.01186 <a href="#fnref:16" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:17" role="doc-endnote">
      <p>Trinh, T. H., Wu, Y., Le, Q. V., He, H., &amp; Luong, T. (2024). Solving olympiad geometry without human demonstrations. Nature. https://doi.org/10.1038/s41586-023-06747-5 <a href="#fnref:17" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:18" role="doc-endnote">
      <p>The reader might find mathematical proofs a bit out of scope for the topic of this dissertation, but the Curry-Howard correspondence equates propositions with types and proofs with programs. That means that when we talk of geometrical validity proofs we are actually describing programs that derive consequences from the axioms to the target state <a href="#fnref:18" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:19" role="doc-endnote">
      <p>Kambhampati, S., Valmeekam, K., Guan, L., Stechly, K., Verma, M., Bhambri, S., Saldyt, L., &amp; Murthy, A. (2024). <em>LLMs can’t plan, but can help planning in LLM-Modulo frameworks</em>. arXiv. https://arxiv.org/abs/2402.01817 <a href="#fnref:19" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:20" role="doc-endnote">
      <p>Trinh, T. H., Wu, Y., Le, Q. V., He, H., &amp; Luong, T. (2024). Solving olympiad geometry without human demonstrations. Nature. https://doi.org/10.1038/s41586-023-06747-5 <a href="#fnref:20" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:21" role="doc-endnote">
      <p>Balog, M., Gaunt, A.L., Brockschmidt, M., Nowozin, S., &amp; Tarlow, D. (2016). DeepCoder: Learning to Write Programs. <em>ArXiv, abs/1611.01989</em>. <a href="#fnref:21" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:22" role="doc-endnote">
      <p>Li, Y., Parsert, J., &amp; Polgreen, E. (2024). Guiding enumerative program synthesis with large language models. Proc. ACM Program. Lang., 8(POPL). <a href="#fnref:22" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:23" role="doc-endnote">
      <p>Though it appears to be more bottom-up, it is still top-down because the process is driven by attempting to satisfy the overall goal of transforming the input to the target output by recursively applying rules from the DSL. <a href="#fnref:23" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:24" role="doc-endnote">
      <p>Vijayakumar, A. K., Batra, D., Mohta, A., Jain, P., Polozov, O., &amp; Gulwani, S. (2018). Neural-guided deductive search for real-time program synthesis from examples. In Advances in Neural Information Processing Systems. Retrieved from https://arxiv.org/abs/1804.01186 <a href="#fnref:24" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:25" role="doc-endnote">
      <p>Shi, K., Dai, H., Ellis, K., &amp; Sutton, C. (2022). CROSSBEAM: Learning to search in bottom-up program synthesis. arXiv preprint arXiv:2203.10452 <a href="#fnref:25" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:26" role="doc-endnote">
      <p>Barke, S., Gonzalez, E. A., Kasibatla, S. R., Berg-Kirkpatrick, T., &amp; Polikarpova, N. (2024). HYSYNTH Context-Free LLM Approximation for guiding program synthesis. <a href="#fnref:26" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="hybrid-ai" /><category term="program-synthesis" /><category term="programming" /><category term="neural-symbolic" /><category term="llm" /><summary type="html"><![CDATA[Computer programming is a specialized activity that requires long training and experience to match productivity, precision and integration. It hasn't been a secret for AI practitioners to ultimately create software tools that can facilitate the role of programmers. The branch of AI dedicated to automatically generate programs from examples or some sort of specification is called program synthesis. In this dissertation, I'll explore different methods to combine symbolic AI and neural networks (like large language models) for automatically create programs. The posed question is: How AI methods can be integrated for helping to synthesize programs for a wide range of applications?]]></summary></entry><entry><title type="html">Rhetoric LLMs and Argumentation</title><link href="https://gfrison.com/2024/12/01/rhetoric-llms-argumentation" rel="alternate" type="text/html" title="Rhetoric LLMs and Argumentation" /><published>2024-12-01T00:00:00+01:00</published><updated>2024-12-01T00:00:00+01:00</updated><id>https://gfrison.com/2024/12/01/argumentation-llms-rhetoric</id><content type="html" xml:base="https://gfrison.com/2024/12/01/rhetoric-llms-argumentation"><![CDATA[<p>Since the advent of deep neural transformer architecture - a portfolio of techniques aim to correlate long distant semantic dependencies among words -  large language models (LLMs) have shown an unprecedented ability on a multitude of tasks. <br />
Of course, by the original purpose of their design, their main task is the mastery of human language. Argumentation is a linguistic exercise among two or more persons, intended to convey some sort of belief from one participant to another. <br />
LLMs’ linguistic skills can be directed towards better argumentation, focusing on rhetoric and persuasive means.</p>

<p class="notice--primary">For more logic-based argumentation, <a href="/2024/12/01/defeasible-logic-automatic-argumentation">here is a deep dive</a> into the world of defeasible logic.</p>

<p>Adopting neural processes for argumentative tasks is not a fresh idea. There have been initiatives for <strong>argumentation mining</strong> - the extraction of latent parts (claim, warrant, backing, etc.) from raw text - a process similar to named entity recognition, for example. Another task where deep learning has found applicability is <strong>argument generation</strong>, which is the reverse of mining. The generation phase transforms a given argument’s structure into coherent and fluent paraphrases.</p>

<p>LLMs are quite effective in those tasks without a particularly laborious setup. I’m referring to the capability of solving those tasks with minimal (few-shot learning) or even no (zero-shot learning) samples of how to accomplish the required work.</p>

<p>Though LLMs have outstandingly raised the bar of <em>machine intelligence</em> - whatever meaning one attaches to the terminology - they are affected by the collateral effects of their own design.</p>

<p>The fundamental LLM feature is predicting what comes after the last token (word or any symbol): the so-called <strong>auto-regressive</strong> property of generative models, which consists of recursively iterating over what has already been generated. This enables the generation of fluent text and the answering of an incredible variety of questions. Next-token prediction scales a minimalist process (thanks to the transformer architecture) to an enormous amount of data. But auto-regression also leads to <strong>hallucinations</strong> - generated chunks not aligned with reality.</p>

<figure style="width: 400px" class="align-center">
  <img src="/assets/images/argue-llm.webp" alt="Arguing LLM" />
</figure>

<p>Those fictitious answers are a collateral effect of the LLM design. Basically, an LLM <em>always</em> hallucinates<sup id="fnref:subbarao" role="doc-noteref"><a href="#fn:subbarao" class="footnote" rel="footnote">1</a></sup>. Sometimes the generated output is sound and valid, sometimes it is just a well-assembled bunch of fantasies. One way to mitigate the generation of statements not adherent to our world is the adoption of the <strong>chain of thought</strong> (CoT), a technique that is also adopted for argumentation, as we will see next.</p>

<h3 id="argumentation-step-by-step">argumentation step by step</h3>
<p>CoT is a technique used to encourage an LLM to generate higher quality answers. As previously mentioned, an LLM generates a series of words (tokens) according to what is submitted (or self-produced) and the corpus on which the model has been trained. If the request implies a concatenation of <a href="https://community.sap.com/t5/artificial-intelligence-and-machine-learning/cracking-the-knowledge-code-hybrid-ai-for-matching-information-systems-and/m-p/13711568">reasoning steps not <em>explicitly</em> expressed</a>, the generative part will jump to conclusions, with the obvious risk of failing to sequence a correct chain of reasoning, causing diversions and utterly wrong answers.</p>

<p>This problem is mitigated by instructing the model to explicitly track the logical path from premises to conclusion - remarkably similar to what we do every time we mentally assess a complex argument with several checkpoints. The goal of CoT is to divide a problem into small chunks and facilitate reasoning in smaller steps, where the effort to connect premises and conclusions is much lower than in the original (bigger) argument.</p>
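In practice, a CoT request can be as simple as wrapping the question in instructions that force the intermediate steps to be written out. The wording below is purely illustrative and no specific LLM API is assumed:

```python
# Sketch: turning a bare question into a chain-of-thought prompt.
# The instruction text is an illustrative example, not a canonical template.

def cot_prompt(question):
    return (
        "Answer the question below. Before giving the final answer, "
        "list each reasoning step explicitly, one per line, and check that "
        "every step follows from the previous ones.\n\n"
        f"Question: {question}\nSteps:\n1."
    )
```

The trailing `1.` nudges the model to start enumerating steps instead of jumping to the conclusion.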

<h3 id="multiagent-automated-debate">multiagent automated debate</h3>
<p>CoT is even more effective when it is augmented not by simply instructing the LLM to take care of the intermediate steps, but by arguing over each intermediate conclusion in a <strong>multiagent debate</strong><sup id="fnref:du" role="doc-noteref"><a href="#fn:du" class="footnote" rel="footnote">2</a></sup>. What could be the opponent of an LLM agent? Of course, another LLM agent. This approach is inspired by the concept of <em>“The Society of Mind”</em><sup id="fnref:minsky" role="doc-noteref"><a href="#fn:minsky" class="footnote" rel="footnote">3</a></sup>, and the core idea is to have multiple LLMs independently generating arguments. The independent agents then critique and refine their responses based on the arguments and reasoning presented by the others, eventually converging on a single, more accurate conclusion.</p>
<figure style="width: 400px" class="align-center">
  <img src="/assets/images/multi-debate.png" alt="Multi-debate LLM" />
  <caption>Performance gap in multi-debating agents (Du, Yilun, et al. 2023)</caption>
</figure>
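The debate loop itself can be sketched in a few lines. The agents below are stand-in callables; a real implementation would call an LLM with the peers' answers embedded in the prompt, and the consensus/voting policy is one plausible choice among several:

```python
# Sketch of a multiagent debate: each agent answers, then revises its answer
# after seeing the others' responses, until consensus or rounds run out.

def debate(agents, question, rounds=3):
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        if len(set(answers)) == 1:          # consensus reached
            return answers[0]
        answers = [
            agent(question, [a for j, a in enumerate(answers) if j != i])
            for i, agent in enumerate(agents)
        ]
    # no consensus: fall back to a majority vote
    return max(set(answers), key=answers.count)
```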

<p>The multiagent debate is mainly focused on unfolding a particular reasoning path, while the focus of the task is <em>not</em> on searching for <em>alternative</em> ways of <del>thinking of</del> assessing conclusions. Which is the point of the next paragraph.</p>

<h3 id="self-consistency-and-contrastive-argumentation">self-consistency and contrastive argumentation</h3>
<p>Consistency is a crucial aspect of argumentation, and it is scrupulously controlled in logic frameworks like the <a href="/2024/12/01/defeasible-logic-automatic-argumentation">ones previously depicted</a>. The probabilistic nature of LLMs and their intrinsic hallucinating mechanisms may often clash with this important value, and without specific mechanisms to promote self-consistency, LLMs may produce conflicting outputs.</p>

<p>A variation of CoT that aims to boost (according to the authors) the performance of LLMs is a strategy that leverages the idea of generating multiple reasoning paths<sup id="fnref:contrastive" role="doc-noteref"><a href="#fn:contrastive" class="footnote" rel="footnote">4</a></sup>. A self-consistency check can then identify contradictory claims, discarding them and selecting only conflict-free candidates, or it will attempt to resolve them by inspecting additional steps.</p>
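The sampling-and-voting skeleton can be sketched as follows. The `sample_path` callable stands in for one stochastic LLM reasoning pass, and the boolean consistency flag abstracts whatever contradiction check is applied to the chain:

```python
# Sketch of self-consistency: sample several reasoning paths, discard the
# internally contradictory ones, and keep the majority answer.

from collections import Counter

def self_consistent_answer(sample_path, n_samples=5):
    # sample_path() -> (final_answer, is_internally_consistent)
    votes = Counter()
    for _ in range(n_samples):
        answer, consistent = sample_path()
        if consistent:                      # drop contradictory chains
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None
```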

<p>Self-consistency may be reinforced by a technique that helps the LLM discriminate good reasoning paths from bad ones. <strong>Contrastive argumentation</strong> works by providing samples of claims with good and bad conclusions and related explanations, such as: if all dogs are mammals and all mammals have fur, does it follow that all dogs have fur? i.e.: $?has(dog,fur)$<br />
\(\begin{eqnarray}
has(x,fur) &amp;\Leftarrow&amp; is(x,mammal)\\
is(x,mammal) &amp;\leftarrow&amp; is(x,dog)
\end{eqnarray}\)</p>

<ul>
  <li>Correct Answer Explanation: <em>Yes</em>, since all dogs are mammals and all mammals have fur, it logically follows that all dogs must have fur.</li>
  <li>Incorrect Answer Explanation: <em>No</em>, some dogs might be hairless. (The incorrect explanation ignores the stated premises and introduces an exception not supported by the given information).</li>
</ul>

<h2 id="automated-rhetoric">automated rhetoric</h2>
<p>So far, we have seen the building blocks of the logic behind argumentation, which is addressed by the initiatives in automated argumentation discussed in previous sections. The advent of LLMs offers new opportunities for addressing communicative aspects such as rhetoric, and for enhancing argumentation within human language.</p>

<p>Rhetoric is the study of effective speaking and writing as it relates to persuasion. It requires understanding the distinction between the <strong>content</strong> and the <strong>form</strong> of communication. Rhetoricians emphasize their interdependence, highlighting the connection between language and meaning, argument and ornament, thought and its expression. This means linguistic forms are fundamental, not just to persuasion, but to thought itself.</p>

<h3 id="content-topics-of-invention"><em>content</em>: topics of invention</h3>
<p>The topics of invention (TI) are categories of thought that can be used to generate arguments. They are like lenses through which you can view the claim, the backing and even the evidence. They are used in combination to create a strong and persuasive argument. For example, you might start by defining the problem, then explore its <strong>causes and effects</strong>. You could then propose a solution, <strong>compare</strong> and <strong>contrast</strong> it with other solutions, draw <strong>analogies</strong> with other arguments and cite expert <strong>testimony</strong> to support your claims. Finally, you could address potential objections and refute them using counterarguments<sup id="fnref:elephant" role="doc-noteref"><a href="#fn:elephant" class="footnote" rel="footnote">5</a></sup>.</p>

<ul>
  <li><strong>Cause and Effect:</strong> This involves exploring the causes of a problem or the effects of a proposed solution.</li>
  <li><strong>Comparison and Contrast:</strong> This involves comparing and contrasting different ideas, arguments, or proposals.</li>
  <li><strong>Definition:</strong> This involves defining key terms and concepts in the argument.</li>
  <li><strong>Testimony:</strong> This involves citing the opinions of experts or authorities to support your argument.</li>
  <li><strong>Analogy:</strong> This involves drawing comparisons between different things to illustrate a point.</li>
  <li><strong>Generalization:</strong> This involves making broad statements based on specific examples or evidence.</li>
  <li><strong>Division and Partition:</strong> This involves breaking down a complex issue into smaller, more manageable parts.</li>
  <li><strong>Concession and Refutation:</strong> This involves acknowledging the opposing side’s arguments and then refuting them.</li>
</ul>

<h3 id="form-figures-of-speech"><em>form</em>: figures of speech</h3>
<p>Figures of speech are a subset of rhetorical devices that specifically use language in creative ways to convey meaning and evoke emotions. Rhetorical devices are techniques used to enhance communication and persuasion, and figures of speech are specific tools within this broader category.</p>
<ul>
  <li><strong>Simile</strong>: A comparison using “like” or “as.” <em>Example:</em> “Her eyes are <em>like</em> stars.”</li>
  <li><strong>Metaphor</strong>: A direct comparison without using “like” or “as.” <em>Example:</em> “He is a <em>lion</em> in battle.”</li>
  <li><strong>Personification</strong>: Giving human qualities to non-human things. <em>Example:</em> “The wind <em>whispered</em> through the trees.”</li>
  <li><strong>Hyperbole</strong>: Exaggeration for emphasis.   <em>Example:</em> “I’m so hungry, I could eat a horse.”</li>
  <li><strong>Understatement</strong>: Deliberately saying less than what is meant. <em>Example:</em> “It’s a bit chilly outside” (when it’s freezing).</li>
  <li><strong>Oxymoron</strong>: Combining contradictory terms. <em>Example:</em> “Bittersweet,” “jumbo shrimp”</li>
  <li><strong>Pun</strong>: A play on words, often humorous. <em>Example:</em> “I’m reading a book about anti-gravity. It’s impossible to put down.”</li>
</ul>

<h3 id="how-much-are-llms-persuasive">how persuasive are LLMs?</h3>
<p>As one could expect from training on a large slice of the entire human-written corpus, LLMs can adopt the most effective and persuasive rhetorical strategies. Have you ever opened a random prompt for a practical task? It usually begins with <em>“You are a X and you are demanded to…”</em>, where <em>X</em> could be a professional figure or any kind of behavior-driven agent the LLM should impersonate, in order to run the task with a very specific flavor. Not only can LLMs adapt to different prompts, they also master control over persuasive strategies. For example, LLMs push forward on moral values - especially on negative moral bias. This may be due to the fact that we humans are more sensitive to negative outcomes than to positive ones, so an LLM may emphasize the bad effects of a particular action or policy, appealing to the audience’s desire to avoid causing harm.</p>

<p>LLMs seem to use grammatically and lexically more complex sentences, probably because human readers may interpret the need for greater cognitive effort as an indicator of the argument’s significance or substance, despite the common notion that simpler, easier-to-process arguments are inherently more persuasive<sup id="fnref:carrasco" role="doc-noteref"><a href="#fn:carrasco" class="footnote" rel="footnote">6</a></sup>.</p>

<h1 id="conclusion">Conclusion</h1>
<p>This has been a brief introduction to argumentation as an interdisciplinary subject, tackled from many different angles and perspectives. It attracts interest from a very diverse set of domains such as philosophy, logic, symbolic and neural processing, but also the cognitive and social sciences. Each field of research advances its own contribution, covering some aspects of the vast panorama that argumentation involves. Still, it is important to summarize them and synthesize some converging points where different fields can contribute to each other. For example, the integration of logical soundness and validity into rhetorical devices - enabling a more precise dialectical process - is an open issue rooted in the symbolic-neural dualism: it looks very promising from the point of view of automated argumentation, but is still far from effective applicability.</p>

<hr />
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:subbarao" role="doc-endnote">
      <p>Kambhampati, Subbarao, et al. “LLMs can’t plan, but can help planning in LLM-modulo frameworks.” <em>arXiv preprint arXiv:2402.01817</em> (2024). <a href="#fnref:subbarao" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:du" role="doc-endnote">
      <p>Du, Yilun, et al. “Improving factuality and reasoning in language models through multiagent debate.” <em>arXiv preprint arXiv:2305.14325</em> (2023). <a href="#fnref:du" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:minsky" role="doc-endnote">
      <p>M. Minsky. Society of mind. Simon and Schuster, 1988. <a href="#fnref:minsky" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:contrastive" role="doc-endnote">
      <p>Chia, Yew Ken, et al. “Contrastive chain-of-thought prompting.” <em>arXiv preprint arXiv:2311.09277</em> (2023). <a href="#fnref:contrastive" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:elephant" role="doc-endnote">
      <p>Proceedings of the 4th Workshop on Figurative Language Processing (FLP), pages 45–52, June 21, 2024. ©2024 Association for Computational Linguistics - The Elephant in the Room: Ten Challenges of Computational <a href="https://aclanthology.org/2024.figlang-1.6.pdf">Detection of Rhetorical Figures</a>. <a href="#fnref:elephant" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:carrasco" role="doc-endnote">
      <p>Carrasco-Farre, Carlos. “Large Language Models are as persuasive as humans, but how? About the cognitive effort and moral-emotional language of LLM arguments.” <em>arXiv preprint arXiv:2404.09329</em> (2024). <a href="#fnref:carrasco" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="argumentation" /><category term="LLM" /><summary type="html"><![CDATA[How persuasive are LLMs? Here is a brief excursus on how automatic argumentation can benefit from LLMs’ linguistic skills]]></summary></entry><entry><title type="html">Defeasible Logic for Automatic Argumentation</title><link href="https://gfrison.com/2024/12/01/defeasible-logic-automatic-argumentation" rel="alternate" type="text/html" title="Defeasible Logic for Automatic Argumentation" /><published>2024-12-01T00:00:00+01:00</published><updated>2024-12-07T00:00:00+01:00</updated><id>https://gfrison.com/2024/12/01/defeasible-logic-automatic-argumentation</id><content type="html" xml:base="https://gfrison.com/2024/12/01/defeasible-logic-automatic-argumentation"><![CDATA[<p>You can’t escape arguments. They represent the fabric of world understanding at any level of resolution at which you want to interpret reality. There is no choice or decision we make without supporting or confuting other choices advanced by others or even by ourselves. Arguments are everywhere.</p>
<figure style="width: 400px" class="align-center">
  <img src="/assets/images/argumentation-clinic.png" alt="Argumentation Clinic" />
  <figcaption>Monty Python's Argumentation Clinic</figcaption>
</figure>
<p>In a famous old television series, a patient enters a clinic to have an argument, but the specialist dismisses the request by simply denying whatever the patient says. At that, the patient complains that the mere denial of a proposition is not an argument. <em>“An argument - my fellow doctor - is a connected series of propositions that are intended to support or refute some other kind of proposition!”</em>, the patient shouts. <br />
Which is not so different from:</p>
<blockquote>
  <p><em>“Argumentation is a verbal, social, and rational activity aimed at convincing a reasonable critic of the acceptability of a standpoint by putting forward a constellation of propositions justifying or refuting the proposition expressed in the standpoint<sup id="fnref:eemeren" role="doc-noteref"><a href="#fn:eemeren" class="footnote" rel="footnote">1</a></sup>.”</em></p>
</blockquote>

<p><a href="/2022/09/01/argumentation-ecommerce-semantics">Argumentation</a> is inherently <em>linguistic</em> - in spoken or written form - and it is a <em>social</em> activity, an interaction between two or more participants with different opinions. Without contested ideas, argumentation would not take place. Having an “argument” implies responding to one another’s claims by supporting, attacking or defending positions. Argumentation is mainly a <em>rational</em> activity, an exchange of reasonable arguments, though <strong>rhetoric</strong> tactics can significantly influence the persuasive power of an arguer. <br />
The nuanced spectrum of rhetoric can hardly be captured by reasoning and logical processes, but it can be supplied by other means. I’m referring in particular to large language models, which can tailor an argument for a particular audience.</p>

<p class="notice--primary">Want to read more about it? Take a look into <a href="/2024/12/01/rhetoric-llms-argumentation">LLMs rhetorical skills</a></p>

<p>The latter point is crucial for depicting argumentation. Argumentation investigates the communicative aspects of reasoning and can be described as an interplay between logic and rhetoric. While rhetoric is a communication exercise to make an argument more or less appealing to the recipient, logic is a philosophical pursuit concerned with the <em>soundness</em> of a proposition rather than championing persuasion as its primary goal.</p>

<p>The first introduction to a systematic theory of argumentation comes from Dung and his abstract argumentation framework<sup id="fnref:dung" role="doc-noteref"><a href="#fn:dung" class="footnote" rel="footnote">2</a></sup>, where arguments are nodes in a graph. An argumentation system is defined as $S=\langle A,R\rangle$ where $A$ is the set of arguments and $R$ is the <strong>attack</strong> relation between them. In the example below, $c$ is <strong>defeated</strong> by both $d$ and $b$. On the other hand, $b$ is defeated by $a$, which neutralizes the attack of $b$ towards $c$. According to the framework, an argument is <strong>undefeated</strong> if no argument attacks it, or if all of its attackers are themselves defeated. As the most casual reader might notice, the computational load required to check whether a statement is defeated or not depends on the size of the graph, but <em>“Hey man, this is what computers are for!”</em>. It is then beneficial to use automated means to reach a consensus.</p>
<figure style="width: 400px" class="align-center">
  <img src="/assets/images/dung2.png" alt="Dung framework" />
</figure>
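<p>The undefeated/defeated check can be automated in a few lines. Below is a minimal sketch (plain Python; the function name and graph encoding are my own, not part of any framework) that labels the arguments of the example graph by repeatedly applying Dung’s rule: an argument is undefeated when all of its attackers are defeated.</p>

```python
# Grounded labelling of a Dung-style abstract argumentation framework:
# an argument is undefeated when every one of its attackers is defeated.
def grounded_labels(arguments, attacks):
    """arguments: iterable of names; attacks: set of (attacker, target) pairs."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    labels = {}
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if a in labels:
                continue
            if all(labels.get(x) == "defeated" for x in attackers[a]):
                labels[a] = "undefeated"   # no standing attacker
                changed = True
            elif any(labels.get(x) == "undefeated" for x in attackers[a]):
                labels[a] = "defeated"     # attacked by an accepted argument
                changed = True
    return labels

# The example above: a attacks b, while b and d both attack c.
labels = grounded_labels({"a", "b", "c", "d"}, {("a", "b"), ("b", "c"), ("d", "c")})
# a and d are undefeated; b falls to a; c stays defeated because d survives.
```

<p>Arguments caught in cycles of equal strength simply remain unlabeled, which mirrors the stalemate situations discussed further below.</p>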

<h3 id="structured-argumentation">structured argumentation</h3>
<p>In abstract argumentation every argument is regarded as atomic. Dung’s framework does not dissect arguments into their constituent parts, which could be seen as an oversimplification. This is why we can turn to <strong>structured argumentation</strong>, in which the different parts that compose an argument are declared explicitly.</p>
<blockquote>
  <p>An argument is then said to be structured in the sense that normally the premises and claim of the argument are made explicit, and the relationship between the premises and claim is formally defined<sup id="fnref:besnard" role="doc-noteref"><a href="#fn:besnard" class="footnote" rel="footnote">3</a></sup>.</p>
</blockquote>

<p>What, then, are the constituent parts of an argument? Toulmin<sup id="fnref:toulmin" role="doc-noteref"><a href="#fn:toulmin" class="footnote" rel="footnote">4</a></sup> has traced a list that could be roughly summarized as:</p>
<ul>
  <li><strong>Claim</strong>. It is an <a href="https://en.wikipedia.org/wiki/Performative_utterance"><em>explicit performative</em></a> at the core of the argument. It’s the conclusion that needs to be proved (claimed) and stated to other participants.</li>
  <li><strong>Data</strong>. This is the available evidence that triggers the validation or refusal of a claim based on some warrants.</li>
  <li><strong>Reason</strong>. It could be an <em>explanation</em> with the intent to tell why a claim is true. It could be a <em>justification</em> with the purpose to make a conclusion believable.</li>
  <li><strong>Warrant</strong>. It is the implicit assumption that a reason is justifiably true. The argument <em>“doing sport regularly helps to lose weight”</em> implies that losing weight (of course, for those overweight) is a good thing. It is the bridge that lets a supporting reason or evidence make the claim truly believable.</li>
  <li><strong>Backing</strong>. The support for the warrant. There is plenty of scientific evidence confirming that people of normal weight enjoy better health conditions for much longer than heavily overweight ones.</li>
  <li><strong>Qualifier</strong>. It helps bound the claim to a scope in which its validity can be more successfully validated. Saying <em>“most people will benefit”</em>, rather than <em>all</em>, avoids being easily confuted by statistical outliers.</li>
</ul>

<p>The approach adopted by Toulmin departs a bit from the usual mathematical form of logic. According to him, the geometrical approach to argumentation fails to accommodate the complexity of real-world cases - especially those involving empirical evidence - and fails to capture the context-dependencies of an argument. The list above reflects a <em>jurisprudential</em><sup id="fnref:toulmin:1" role="doc-noteref"><a href="#fn:toulmin" class="footnote" rel="footnote">4</a></sup> rather than a mathematical approach. Real-world argumentation resembles legal cases more closely than math proofs. Most arguments do not scrupulously follow the analytical syllogism <em>“Socrates is human, all humans are mortal, therefore Socrates is mortal”</em>. Claims are rather supported by evidence, the warrant’s backing, and the possibility of rebuttals.</p>
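<p>Toulmin’s layout lends itself to a simple data structure. The sketch below is a hypothetical container - the field names mirror the list above and belong to no standard library - filled in with the weight-loss example.</p>

```python
# Hypothetical encoding of Toulmin's argument layout; field names are
# illustrative only and follow the components listed above.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ToulminArgument:
    claim: str                       # the conclusion to be established
    data: List[str]                  # available evidence
    reason: Optional[str] = None     # explanation or justification
    warrant: Optional[str] = None    # implicit assumption linking data to claim
    backing: List[str] = field(default_factory=list)  # support for the warrant
    qualifier: Optional[str] = None  # scope limitation, e.g. "most people"

arg = ToulminArgument(
    claim="doing sport regularly helps to lose weight",
    data=["observational evidence on exercise and weight"],
    warrant="losing excess weight is desirable",
    backing=["scientific evidence linking healthy weight to longevity"],
    qualifier="most people",
)
```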

<h3 id="explanations">explanations</h3>
<p>Explanations aim to transfer the understanding of how the warrant status of a particular argument was obtained from a given argumentation framework. Walton<sup id="fnref:walton" role="doc-noteref"><a href="#fn:walton" class="footnote" rel="footnote">5</a></sup> has studied arguments and explanations from a philosophical point of view. He considers that the purpose of an argument is to get the hearer to come to accept something that is doubtful or unsettled, whereas the purpose of an explanation is to get the hearer to understand something that he already accepts as a fact.</p>
<blockquote>
  <p>Explaining consists in exposing something in such a way that it is understandable for the receiver of the explanation – so that he/she improves his/her knowledge about the object of the explanation – and satisfactory in that it meets the receiver’s expectations<sup id="fnref:acave" role="doc-noteref"><a href="#fn:acave" class="footnote" rel="footnote">6</a></sup>.</p>
</blockquote>

<h3 id="practical-applications">practical applications</h3>
<p>Argumentation is an art. As an art, argument has techniques and general principles, therefore it is a learned craft. Although there are suggested guidelines and argumentative tools, there is no science of argument. So why automate it? <br />
As we will see shortly, certain mechanics, mostly related to logical reasoning, can be intuitively automated for handling complex issues with multiple perspectives and contrasting viewpoints. I’m referring to applications in the realm of <strong>legal systems</strong> - take for example checking the consistency of the immense plethora of laws in a state legal system - or to automating intelligent agents to <a href="/2024/06/13/cognitive-architecture-decision-making-supply-chain">project provisioning orders</a> in supply chain management.</p>

<p class="notice--primary">Curious how sound arguments might help explain recommendations? <a href="/patents/consumer-problem-discovery-solution-recommender-knowledge-graphs">Here is an implementation</a> where products are suggested based on consumer preferences and grounded knowledge (knowledge graphs) to explain why a suggestion is better than another.</p>

<p>These are some examples of how argumentation permeates not only our daily life but also the way we conduct business.</p>

<h1 id="elements-of-logic">Elements of logic</h1>
<p>To begin our endeavour into the intricacies of argumentation, we should dive into the mechanics and well-definable aspects of argumentation. We start with logical reasoning. Here, I arrange the commonalities and peculiarities of 3 approaches to automatic argumentation: <code class="language-plaintext highlighter-rouge">ABA</code><sup id="fnref:aba" role="doc-noteref"><a href="#fn:aba" class="footnote" rel="footnote">7</a></sup>, <code class="language-plaintext highlighter-rouge">ASPIC+</code><sup id="fnref:aspic" role="doc-noteref"><a href="#fn:aspic" class="footnote" rel="footnote">8</a></sup> and <code class="language-plaintext highlighter-rouge">DeLP</code><sup id="fnref:delp" role="doc-noteref"><a href="#fn:delp" class="footnote" rel="footnote">9</a></sup>.</p>

<h3 id="negation">negation</h3>
<p>One of the most nuanced concepts in logic programming is negation. A statement from casual conversation such as <em>“the accused is not guilty”</em> ($\neg guilty$) is the form of negation closest to the <strong>strict negation</strong> one might find in classical logic. The <em>“classical”</em> adjective stands for the kind of logic elaborated by the ancient Greeks, with Aristotle as the greatest contributor. According to the <a href="https://en.wikipedia.org/wiki/Law_of_excluded_middle">law of excluded middle</a>, the accused is either guilty or he’s not. There are no shades of guilt - which are instead embraced by fuzzy logic - only black or white, if you let me indulge in the color metaphor. <br />
There are also other concepts related to negation, adopted in particular by practical logic programming languages, notably <strong>negation as failure</strong>: the reasoning principle of assuming that the negation of a statement is true if the system cannot prove the statement itself. What does it mean concretely? <br />
When I can’t find biscuits in a supermarket, I can assume that there are <em>no biscuits</em> in that supermarket. I may certainly find them elsewhere, but not there and not at that moment. Negation as failure relies on the <strong>closed-world assumption</strong>: what can’t be found in the knowledge base (the supermarket) simply doesn’t exist. This is particularly common in logic languages such as <code class="language-plaintext highlighter-rouge">DeLP</code>, but it might take slightly different names, such as <strong>default negation</strong>.</p>
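<p>The supermarket metaphor condenses into a toy sketch of negation as failure under the closed-world assumption; the knowledge base and helper names below are invented for illustration.</p>

```python
# Negation as failure under the closed-world assumption: a query that
# cannot be proved from the knowledge base is assumed to be false.
knowledge_base = {"bread", "milk", "eggs"}   # the supermarket "shelves"

def holds(fact):
    return fact in knowledge_base

def not_(fact):
    # true exactly when the fact is not provable (closed-world assumption)
    return not holds(fact)

# No biscuits on the shelves, so we assume there are none - here and now.
assert not_("biscuits")
assert not not_("milk")
```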

<h3 id="defeasible-logic">defeasible logic</h3>
<p>Logic is the basic language of reasoning. Classical logic embraces the law of excluded middle, and what is already known to be true (or false) cannot change due to new incoming information (<strong>monotonicity</strong>). While those conditions simplify computation and the validation of logic statements, they limit the scope in which classical logic might apply:</p>
<blockquote>
  <p>logical argumentation has the ultimate goal to refuse or support a proposition, therefore the change of truth of any statement is an inherent and desirable property.</p>
</blockquote>

<p>Classical logic can’t find a place here. It must be superseded by an elastic framework that contemplates dynamic changes of truth. This is where <strong>defeasible logic</strong> (DL) comes into play. In the classical style, once a statement is established, it is no longer possible to alter it without falling into conflicts and inconsistencies. Defeasible logic instead, by adhering to the non-monotonic paradigm, embraces belief change. Welcoming fresh information, DL may alter previously established conclusions. A practical example could be the statement: <em>“the accused is not guilty if not proved otherwise”</em>. By default, the accused is not guilty unless the contrary is proved: <br />
\(\begin{eqnarray}\neg guilty(x) &amp;\leftarrow accused(x), not\; guilty(x) \end{eqnarray}\)</p>

<p>With that we make use of default negation, which is itself non-monotonic - the $not$ fails when the predicate $guilty(x)$ becomes provable. Another approach, implemented in <code class="language-plaintext highlighter-rouge">ASPIC+</code> and <code class="language-plaintext highlighter-rouge">DeLP</code>, is the introduction of defeasible rules, according to which a rule normally applies unless a more specific rule overrides the default one: \(\begin{eqnarray}
\label{accused} \neg guilty(x) &amp;\Leftarrow&amp; accused(x)\\ 
\label{guilty} guilty(x) &amp;\leftarrow&amp; evidence(accused(x))
\end{eqnarray}\)The rule ($\ref{accused}$) affirms that by default the accused is not guilty, unless the contrary is proved. The following <strong>strict rule</strong> ($\ref{guilty}$) expresses that if there is evidence against the accused, he <em>is</em> guilty - no matter what could be subsequently determined.</p>

<p>With defeasible rules, there is a distinct separation between the default behavior usually assumed, and the <em>exceptions</em> that might occur. Some examples:</p>
<ul>
  <li>A bird usually flies ($\ref{defeasible}$), but a penguin does not ($\ref{strict}$).</li>
  <li>In ($\ref{dead}$), a combination of negation as failure and a defeasible proposition translates as: <em>“if we’re not sure someone is dead, we can assume he’s alive”</em> - which is why death certificates exist.</li>
</ul>

\[\begin{eqnarray}
\label{defeasible} fly(x) &amp;\Leftarrow&amp; bird(x) \\
\label{strict} \neg fly(x) &amp;\leftarrow&amp; penguin(x) \\
\label{dead} alive &amp;\Leftarrow&amp; not\; dead \\
\end{eqnarray}\]

<p>The default rule has a specific notation ($\Leftarrow$) that marks its fallibility, while the other ($\leftarrow$) denotes a strict rule, which ranks higher than the fallible one and supersedes all lower-ranking rules. Both rules can coexist in the same program without raising any conflict, but what happens when two or more conflicting defeasible rules apply? <code class="language-plaintext highlighter-rouge">DeLP</code> uses a <strong>dialectical process</strong> to decide which information prevails.</p>
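<p>The bird/penguin example can be played out with a tiny hand-rolled evaluation in which the strict rule always outranks the defeasible default; the encoding is ad hoc, not DeLP syntax.</p>

```python
# Strict rule beats defeasible default: ~fly <- penguin outranks fly <= bird.
def flies(facts):
    if "penguin" in facts:   # strict rule: penguins do not fly, no exceptions
        return False
    if "bird" in facts:      # defeasible default: birds normally fly
        return True
    return None              # nothing is known either way

assert flies({"bird"}) is True
assert flies({"bird", "penguin"}) is False
assert flies({"fish"}) is None
```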

<h3 id="dialectical-process">dialectical process</h3>
<p>In the sample below, if both $mosquito$ and $dengue$ apply, what should the system assert? \(\begin{eqnarray}
\label{dengue} dangerous &amp;\Leftarrow&amp; mosquito, dengue\\
\label{mosquito} \neg dangerous &amp;\Leftarrow&amp; mosquito
\end{eqnarray}\)If we apply the <strong>specificity tactic</strong>, the rule ($\ref{dengue}$) should have higher priority because it is more specific, i.e. it entails more literals than ($\ref{mosquito}$). <br />
What if, instead, the specificity tactic can’t be applied because the rules’ specificity is the same?</p>

<p>Some notion of <strong>priority</strong> must be introduced in conflicting defeasible rules. It could be implemented by defining a series of inequalities representing the relative ranking order:\(\begin{eqnarray}
r1&amp;:&amp;walk &amp;\Leftarrow&amp; sunny, free\_time\\
r2&amp;:&amp;\neg walk &amp;\Leftarrow&amp; sunny, allergy\_season\\
\label{pref} r1 &amp;&gt;&amp; r2 
\end{eqnarray}\)</p>

<p>In this example, if $sunny$, \(free\_time\) and \(allergy\_season\) apply, it’s definitely better to go for a $walk$ rather than stay home.</p>
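<p>The preference of $r1$ over $r2$ can be emulated by attaching an explicit rank to each rule and letting the highest-ranking applicable rule decide the literal. A toy sketch with invented names:</p>

```python
# Conflicting defeasible rules resolved by an explicit priority ordering,
# mirroring the ranking above (r1 outranks r2); the higher number wins.
rules = [
    # (name, priority, premises, (literal, truth value))
    ("r1", 2, {"sunny", "free_time"}, ("walk", True)),
    ("r2", 1, {"sunny", "allergy_season"}, ("walk", False)),
]

def conclude(facts, literal):
    applicable = [r for r in rules if r[2] <= facts and r[3][0] == literal]
    if not applicable:
        return None
    best = max(applicable, key=lambda r: r[1])  # highest priority prevails
    return best[3][1]

assert conclude({"sunny", "free_time", "allergy_season"}, "walk") is True
assert conclude({"sunny", "allergy_season"}, "walk") is False
```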

<p>An interesting representational device - the <strong>presumption</strong> - is a defeasible rule with an empty antecedent. As in the case of strict/defeasible rules, the presumption introduces fallibility even for facts. <strong>Strict facts</strong> can’t change; they are infallible, like <em>axioms</em> in mathematics. Everything else, such as: <br />
\(\begin{eqnarray}delivered(letter) \Leftarrow \{\}\end{eqnarray}\)<br />
is fallible and can be overturned by a contrasting deduction or a fact \(\begin{eqnarray}\neg delivered(letter) \Leftarrow wrong\_address(letter)\end{eqnarray}\)</p>

<h2 id="compelling-arguments">compelling arguments</h2>
<p>A common definition of argument among the initiatives for automating it is:</p>
<blockquote>
  <p>an argument is the minimal set of defeasible rules that allow for the defeasible derivation of a conclusion, ensuring no contradictory literals are derived.</p>
</blockquote>

<p>Taking this as a general principle of an argumentation system, what should we expect when querying $Q$? <code class="language-plaintext highlighter-rouge">DeLP</code> offers the most granular depiction of what an answer could be: there are four possible answers to $Q$:</p>
<ul>
  <li>$YES$. If $Q$ has no counterargument standing against it, or all of them are ultimately defeated. $Q$ is warranted.</li>
  <li>$NO$. If $\neg Q$ is warranted. For example, if the query is <em>“Is the sky blue?”</em> and there exists a warranted argument for <em>“the sky is not blue”</em>.</li>
  <li>$UNDECIDED$. If neither $Q$ nor $\neg Q$ is warranted. This is a stalemate in the dialectical process. It can arise when there are conflicting arguments of equal strength, leading to a situation where neither side prevails.</li>
  <li>$UNKNOWN$. If $Q$ is not in the knowledge base $K$. Here, the system lacks the necessary information to even begin constructing arguments for or against $Q$. This is the response it would get if $K$ contains information about birds and their flying abilities, and $Q$ is about the price of tea, as this information is simply out of scope.<br />
This nuanced approach to determining the result stems from the system’s handling of potentially incomplete and inconsistent information.</li>
</ul>
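<p>Assuming the dialectical process has already produced the set of warranted literals, dispatching among the four answers is straightforward. A sketch with invented literal names (<code class="language-plaintext highlighter-rouge">~</code> marks negation):</p>

```python
# The four DeLP-style answers, computed from precomputed warrant statuses.
# `warranted` holds the warranted literals; `known` is the vocabulary of K.
def answer(query, warranted, known):
    if query not in known:
        return "UNKNOWN"          # K says nothing about the topic at all
    if query in warranted:
        return "YES"
    negated = query[1:] if query.startswith("~") else "~" + query
    if negated in warranted:
        return "NO"
    return "UNDECIDED"            # neither side prevails

known = {"sky_blue", "pacifist", "flies"}
warranted = {"~sky_blue", "flies"}
assert answer("flies", warranted, known) == "YES"
assert answer("sky_blue", warranted, known) == "NO"
assert answer("pacifist", warranted, known) == "UNDECIDED"
assert answer("price_of_tea", warranted, known) == "UNKNOWN"
```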

<p>Though the logical argumentation initiatives acknowledge the value of a structured and schematic segmentation of arguments, they don’t attempt to capture the Toulmin framework at that level of granularity. The simpler structure of an argument is condensed into the tuple $\langle A,L \rangle$ where $A$ is the minimal set of rules (strict and defeasible) and $L$ is the claim. An argument is just a part of the whole system; in <code class="language-plaintext highlighter-rouge">DeLP</code> a defeasible logic program is described by $\langle \pi,  \Delta \rangle$ where $\pi$ stands for strict rules and facts, while $\Delta$ indicates defeasible facts and rules. We can say $A$ is an argument for $L$ when:</p>
<ol>
  <li>There exists a defeasible derivation for $L$ from $\pi \cup A$.</li>
  <li>No contradictory literals can be defeasibly derived from $\pi \cup A$.</li>
  <li>If a rule in $A$ contains a negation ‘$not\; F$’, then $F$ can’t be in the defeasible derivation of $L$. This condition prevents circular reasoning and ensures that the argument is logically sound.</li>
  <li>The principle of <em>minimality</em> is also promoted, for which $A$ should contain only the essential set of rules for claiming $L$.</li>
</ol>

<p>The second point could be explained in the following example. Consider a program $P$ formed by:<br />
\(\pi = \left\{ \begin{array}{l}
night \\
at\_market \\
\neg at\_home \leftarrow at\_market
\end{array} \right\}
\qquad
\Delta = \{at\_home \Leftarrow night\}\)</p>

<p>The same literal \(at\_home\) is strictly negated but also <em>reinstated</em> in $\Delta$ as a defeasible rule. This raises a contradiction, which is forbidden in <code class="language-plaintext highlighter-rouge">DeLP</code>. In other approaches, such as the GK argumentation system<sup id="fnref:tammet" role="doc-noteref"><a href="#fn:tammet" class="footnote" rel="footnote">10</a></sup>, contradictions are removed automatically from $P$ - with an eye on computational costs, by performing recursive checks with time limits. We can overcome this kind of contradiction by introducing preferences.</p>
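<p>A naive forward-chaining pass over the program above makes the contradiction visible: both the literal and its negation become derivable once the defeasible rule is thrown in with the strict part. The encoding (with <code class="language-plaintext highlighter-rouge">~</code> for negation) is purely illustrative.</p>

```python
# Forward chaining over the example program: the strict part derives
# ~at_home from at_market, while the defeasible rule derives at_home.
strict_facts = {"night", "at_market"}
strict_rules = [({"at_market"}, "~at_home")]   # ~at_home <- at_market
defeasible_rules = [({"night"}, "at_home")]    # at_home <= night

def derivable(facts, rules):
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

literals = derivable(strict_facts, strict_rules + defeasible_rules)
# Both at_home and ~at_home are derivable: the program is contradictory.
contradiction = any(not lit.startswith("~") and "~" + lit in literals
                    for lit in literals)
assert contradiction
```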
<h3 id="dialectic-tree-and-preferences">dialectic tree and preferences</h3>
<p>Argumentation is the process of deciding the status of all arguments in a program. The verdict on whether an argument is valid or not is declared through defeated ($D$) or undefeated ($U$) statuses. Since we tend to avoid binary discrimination, here too there are plenty of flavors of acceptance of arguments. <code class="language-plaintext highlighter-rouge">ASPIC+</code> denotes 3 different types of attacks:</p>
<ul>
  <li><strong>Undercut:</strong> An argument undercuts another argument by attacking the inference step itself, denying that one of its defeasible rules applies in this case.</li>
  <li><strong>Rebut:</strong> An argument rebuts another argument by directly contradicting its conclusion.</li>
  <li><strong>Undermine:</strong> An argument undermines another argument by attacking one of its premises.</li>
</ul>

<p>An argument can be attacked directly on its core claim, or indirectly by attacking the warrant’s backing. An indirect attack happens when an argument attacks a sub-argument (a smaller argument, with its own premises, within a larger argument). As previously mentioned for defeasible logic, argumentation suffers from conditions of stalemate among arguments. The Nixon Diamond provides the proverbial example of blocking defeaters. Consider the following argument structure:\(\begin{aligned}
\langle A_1, L_1\rangle &amp;=&amp; \langle pacifist(nixon) &amp;\Leftarrow quaker(nixon), pacifist(nixon)\rangle \\  
\langle A_2,L_2 \rangle &amp;=&amp; \langle \neg pacifist(nixon) &amp;\Leftarrow republican(nixon), \neg pacifist(nixon)\rangle
\end{aligned}\)</p>

<p>The two arguments defeat each other, therefore the answer to $pacifist(nixon)$ will be $UNDECIDED$. Whether an attack from $\langle A_1, L_1 \rangle$ to $\langle A_2, L_2 \rangle$ succeeds as a defeat may depend on the relative strength of $A_1$ and $A_2$, i.e. whether $A_2$ is strictly stronger than, or strictly preferred to, $A_1$. Where do these preferences come from? <br />
We have already mentioned preferences ($\ref{pref}$) in defeasible logic. All an argumentation system needs is a binary ordering ($\le$) on the set of all arguments that can be constructed<sup id="fnref:modgil" role="doc-noteref"><a href="#fn:modgil" class="footnote" rel="footnote">11</a></sup>.<br />
The reader may observe that the structure of attacks resembles a tree (or an acyclic graph, as circular attacks are prohibited). This tree is the <strong>dialectic tree</strong>, where each argument’s element can be defeated by other arguments. A single path within the tree is termed an <strong>argumentation line</strong>. If a single argumentation line successfully rebuts an argument, the latter is considered defeated and removed.<br />
Giancarlo Frison</p>
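<p>Marking a dialectic tree follows directly from the defeat rule: a node is undefeated ($U$) exactly when every one of its attacking children is defeated ($D$). A recursive sketch over an invented tree encoding:</p>

```python
# Marking a dialectical tree given as a dict: node -> list of its attackers.
def mark(node, tree):
    children = tree.get(node, [])
    if all(mark(child, tree) == "D" for child in children):
        return "U"   # every counterargument is defeated (or there is none)
    return "D"       # at least one attacker survives

# A is attacked by B, which is attacked by C: C defeats B, reinstating A.
tree = {"A": ["B"], "B": ["C"], "C": []}
assert mark("A", tree) == "U"
assert mark("B", tree) == "D"
```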

<hr />
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:eemeren" role="doc-endnote">
      <p>van Eemeren and Grootendoorst, 2004. <a href="#fnref:eemeren" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:dung" role="doc-endnote">
      <p>Phan Minh Dung, On the acceptability of arguments and its fundamental role in non-monotonic reasoning, logic programming and n-person games, Artificial Intelligence, Volume 77, Issue 2, 1995, Pages 321-357. <a href="#fnref:dung" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:besnard" role="doc-endnote">
      <p>Besnard, Philippe, et al. “Introduction to structured argumentation.” <em>Argument &amp; Computation</em> 5.1 (2014): 1-4. <a href="#fnref:besnard" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:toulmin" role="doc-endnote">
      <p>Toulmin, S. E. (2003). The uses of argument (Updated ed.). Cambridge University Press. <a href="#fnref:toulmin" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:toulmin:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:walton" role="doc-endnote">
      <p>Walton, D. (2004). A new dialectical theory of explanation. Philosophical Explorations, 7, 71–89. <a href="#fnref:walton" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:acave" role="doc-endnote">
      <p>Lacave, C., &amp; Diez, F.J. (2004). A review of explanation methods for heuristic expert systems. Knowledge Engineering Review, 19, 133–146. <a href="#fnref:acave" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:aba" role="doc-endnote">
      <p>Dung, Phan Minh, Robert A. Kowalski, and Francesca Toni. “Assumption-based argumentation.” <em>Argumentation in artificial intelligence</em> (2009): 199-218. <a href="#fnref:aba" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:aspic" role="doc-endnote">
      <p>Modgil, Sanjay, and Henry Prakken. “The ASPIC+ framework for structured argumentation: a tutorial.” <em>Argument &amp; Computation</em> 5.1 (2014): 31-62. <a href="#fnref:aspic" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:delp" role="doc-endnote">
      <p>García, Alejandro J., and Guillermo R. Simari. “Defeasible logic programming: An argumentative approach.” <em>Theory and practice of logic programming</em> 4.1-2 (2004): 95-138. <a href="#fnref:delp" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:tammet" role="doc-endnote">
      <p>Tammet, T., Draheim, D., Järv, P. (2022). GK: Implementing Full First Order Default Logic for Commonsense Reasoning (System Description). In: Blanchette, J., Kovács, L., Pattinson, D. (eds) Automated Reasoning. IJCAR 2022. Lecture Notes in Computer Science(), vol 13385. Springer, Cham. https://doi.org/10.1007/978-3-031-10769-6_18. <a href="#fnref:tammet" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:modgil" role="doc-endnote">
      <p>Modgil, Sanjay, and Henry Prakken. “The ASPIC+ framework for structured argumentation: a tutorial.” <em>Argument &amp; Computation</em> 5.1 (2014): 31-62. <a href="#fnref:modgil" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="argumentation" /><category term="logic" /><summary type="html"><![CDATA[Computational argumentation uses strict and defeasible rules to model the logical structure of arguments and evaluate their validity by considering facts, counterarguments, and preferences. Let's figure it out...]]></summary></entry><entry><title type="html">Cognitive Architectures for Business Decision Making in Supply Chain Management</title><link href="https://gfrison.com/2024/06/13/cognitive-architecture-decision-making-supply-chain" rel="alternate" type="text/html" title="Cognitive Architectures for Business Decision Making in Supply Chain Management" /><published>2024-06-13T00:00:00+02:00</published><updated>2024-06-13T00:00:00+02:00</updated><id>https://gfrison.com/2024/06/13/cognitive-architecture-decision-making-supply-chain</id><content type="html" xml:base="https://gfrison.com/2024/06/13/cognitive-architecture-decision-making-supply-chain"><![CDATA[<p>Most companies rely on a network of suppliers to provide the components or materials they need to create their final products. This supplier network allows businesses to add value and earn a profit by transforming these raw materials into finished goods. However, this dependence on suppliers introduces a significant risk for profit-driven businesses: the potential inability to meet customer demands at the agreed-upon conditions (price and timing).<br />
What could possibly go wrong in the chain of suppliers? The risk could be caused by unexpected:</p>
<ol>
  <li>unavailability of suppliers for:
    <ol>
      <li>own internal reasons</li>
      <li>suppliers’ suppliers availability risk</li>
      <li>issues in the distribution</li>
    </ol>
  </li>
  <li>rising costs due to:
    <ol>
      <li>distribution chain</li>
      <li>raw material</li>
      <li>lowering of competitive alternatives in the market</li>
    </ol>
  </li>
  <li>geo-political issues
    <ol>
      <li>wars</li>
      <li>international sanctions</li>
    </ol>
  </li>
  <li>natural adversities</li>
</ol>

<p>This short dissertation aims to analyze the architecture of a potential <em>cognitive system</em> designed to alleviate these supplier-related problems.</p>

<p><strong>What is a cognitive system?</strong> Cognitive systems, in essence, are intelligent agents capable of learning, reasoning, adapting to new environments, and leveraging past experiences to continuously improve. The key advantage? They can deliver high-value functionalities with minimal human intervention.</p>

<p><strong>What is a cognitive system architecture?</strong> In the realm of software, designing an architecture means tracing the invariant aspects and components that meet the expected requirements and functionalities. Two well-established cognitive architectures, SOAR and ACT-R, have been under development for decades and will be the foundation for this exploration.</p>

<h1 id="system-goals">System goals</h1>
<p>As it clearly appears, nearly all elements involved in a corporation’s value chain may be impacted by supply management; procurement influences all the other elements, even top business priorities. This is why a system that fulfills the functionalities proposed here will be important for almost all companies, and surely for manufacturing ones.</p>

<blockquote>
  <p>The goal of the system is to help analysts <em>mitigate</em> risks related to the supply network and to provide more suitable alternatives that lower those risks.</p>
</blockquote>

<p>How would the system actually help?<br />
The system is intended to provide probabilistic estimations for a variety of natural language queries. It will elaborate how much target forecasts diverge from current estimations according to real-time analysis. The purpose is to raise issues that may undermine the predicted costs for the business, and to raise awareness of favorable chances to re-organize the supply chain.</p>

<p>Beyond that, the system should provide estimations not only for the current state of affairs but also for hypothetical situations artificially set up by tweaking initial conditions. <br />
If you are thinking of a simulation engine, that is exactly what I mean. <br />
The tool should elaborate possible consequences and, in cascade (recursively), analyze further outcomes from them. Analysts may run simulations under particular conditions, specifying constraints through natural language propositions such as:</p>
<blockquote>
  <p>What will be the impact on the deliveries for supplier $XYZ$ of an increase of +30% in cargo shipping costs?</p>
</blockquote>

<p>Types of simulation may involve complex aspects related to changes in demand on the customer side, as in <em>how will supply provisioning be affected by a percentage increase of product X and a percentage decrease of product Y?</em></p>

<p>Considering simulations of changes in the manufacturing process, queries may be of interest for industrial managers, where process changes will drive different procurement due to innovations and optimizations:</p>
<blockquote>
  <p>What are the cost estimations for our smartphones if we replace the plastic covers with alloy covers?</p>
</blockquote>

<h1 id="data--system-heterogeneity">Data &amp; system heterogeneity</h1>
<p>To perform the desired functionalities, the system should elaborate a massive quantity of facts, analyze them, and evaluate their impact on direct suppliers and their upstream suppliers. The generality of the term “analysis” hides numerous computationally intensive tasks for capturing regularities in the data, so it would be much easier to replicate past decisions once they have been <em>learned</em><sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">1</a></sup>. By <em>impact</em> is meant <em>events</em> that have a direct effect on the production of goods and their transport around the world, and also events that may be indicative of changes in <em>mood</em> regarding a particular technology, company, or geopolitical area. <em>Rumors</em> are usually taken seriously by decision makers, but at the same time they may often lead to false positives as well.</p>

<p>Data awareness<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">2</a></sup> is just one of the ingredients we need for this recipe. Business processes are not designed from tons of terabytes of data; rather, they are intentionally crafted by experts. The decisions on how things should be set up for a business derive from general ways of conducting a specific process (and are therefore generalizable into templates), but they could also be unique to that business. We may characterize process modelling as:</p>
<ul>
  <li>hierarchical. Processes are generalizable into typical cases (following prototype theory<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">3</a></sup>).</li>
  <li>non-monotonic. Peculiarities are the norm, and every process may differ from the general one.</li>
  <li>compositional. Process models are associative and composable into macro-processes.</li>
</ul>
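The three properties above can be made concrete with an illustrative-only Python sketch. The `Process` class and its `specialize`/`compose` methods are hypothetical names invented for this example, not part of any system described here: a prototype process is specialized by overriding steps (non-monotonicity) and chained into a macro-process (compositionality).

```python
from dataclasses import dataclass, field

@dataclass
class Process:
    """Hypothetical process model, used only to illustrate the three properties."""
    name: str
    steps: list = field(default_factory=list)

    def specialize(self, name, overrides):
        # non-monotonic: a peculiar business overrides steps of the prototype
        return Process(name, [overrides.get(s, s) for s in self.steps])

    def compose(self, other):
        # compositional: two processes chain into a macro-process
        return Process(f"{self.name}+{other.name}", self.steps + other.steps)

# hierarchical: a generic procurement prototype (prototype theory)
procurement = Process("procurement", ["request quote", "select supplier", "order"])
# one business replaces supplier selection with an auction (the exception is the norm)
acme = procurement.specialize("acme", {"select supplier": "run auction"})
# macro-process: procurement followed by inbound logistics
macro = acme.compose(Process("logistics", ["ship", "receive"]))
print(macro.steps)  # ['request quote', 'run auction', 'order', 'ship', 'receive']
```

A real modelling language (BPM or similar, discussed later) would of course carry far richer semantics than a list of step names.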

<h2 id="data-heterogeneity">Data heterogeneity</h2>
<p>Unlike expert input - used for defining how actual industrial and business processes work - all the other types of input are inherently unstructured and noisy. The evaluated architectures propose an approach that tackles the challenge of reconciling symbolic and connectionist representations for robust AI. This <em>heterogeneity</em> allows for diverse data formats and enables the integration of symbolic (cognitivist, based on symbolic logic) and emergent (connectionist, based on neural networks) approaches within a single system, by leveraging existing architectures such as SOAR and ACT-R.</p>

<h1 id="cognitive-features">Cognitive features</h1>
<p>Automatic systems must grapple with a high degree of uncertainty, ambiguity, and common sense – qualities even humans struggle with. These processes require a set of cognitive skills. But what exactly is cognition within the realm of automatic systems? Vernon<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup> offers several depictions of what cognition might be:</p>
<blockquote>
  <p>Cognition is the process by which an autonomous system perceives its environment, learns from experience, anticipates the outcome of events, acts to pursue goals, and adapts to changing circumstances.</p>
</blockquote>

<p>I would point to this more intriguing definition extracted again from his book “Artificial Cognitive Systems”:</p>
<blockquote>
  <p>Cognition is the means by which the system compensates for the immediate ‘here and now’ nature of perception, allowing it to anticipate events that occur over longer timescales and prepare for interaction that may be necessary in the future.</p>
</blockquote>

<p>Cognition is the Swiss Army knife of intelligent animals that strive to maximize their chances of survival in the environment where they live. Adapting to uncertain circumstances is not an exception; therefore the ability to self-improve is among the first principles:</p>
<blockquote>
  <p>Cognition is the result of a developmental process through which the system becomes progressively more skilled and acquires the ability to understand. Anticipation and sense-making are the direct result of the developmental process.</p>
</blockquote>

<p>Let’s summarize the reasons why the system should pursue the virtues of cognition:</p>

<h2 id="inferences">Inferences</h2>
<p>The ability to make inferences about events that might impact a business process is obviously the most important feature. Those events include the actions a company should rule out according to the system’s recommendations. To make such inferences, the system should look back into the past and combine that experience with the given process definitions.</p>

<h2 id="feedback-loop">Feedback-loop</h2>
<p>The system should notice when performance is degrading, identify the reason for the low outcome, and take corrective action. It should engage a feedback loop in order to minimize estimation errors. This feedback could be negative or positive according to the predictions. It could be derived automatically from historic data (supervised learning), but also from experts who can help draw a causal network of beliefs used by the system to infer unseen situations.</p>

<h2 id="autonomy">Autonomy</h2>
<p>The automatic agent should actively trigger alarms concerning its goals and inform about risks and/or actions to take. For that, the system should not only ingest/parse/elaborate the information input, but autonomously search for those sources in order to improve its decision-making capability.</p>

<h2 id="continuous-learning">Continuous learning</h2>
<p>If we assume that a cognitive architecture forms the basis for intelligent systems in uncertain environments, it should incorporate a flexible approach toward learning and skill improvement.</p>

<p><em>While narrow ML systems distinguish training from inference, and expert systems take the given knowledge for granted, a cognitive system should develop an understanding of the world by constructing theories that explain and predict events and behaviors.</em></p>

<p>Continuous learning adheres to the theory-theory view<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">5</a></sup>, by which agents don’t passively accumulate knowledge, but actively and continuously construct and revise what they have understood (by leveraging counterfactuals). This reflects a pillar of cognitive systems referred to as <em>development</em>. It states the importance of self-improvement, which is achieved by self-modification: in this view, not only does the environment perturb the system, but the system perturbs itself <sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">6</a></sup>.</p>

<h2 id="simulations">Simulations</h2>
<p>The system should run simulations to predict alarms that might happen in the supply chain. A simulation occurs anytime there is a change in the information, even when it does not directly affect the tree of suppliers, because of the cascading consequences events may cause to apparently unrelated entities.</p>

<h2 id="interaction-with-other-agents">Interaction with other agents</h2>
<p>For making sense of the world, cognitive agents have to transform the knowledge acquired through their senses into meaningful <em>affordances</em> with which they can achieve their goals. Semantic concepts acquire meaning not only through interconnection with other concepts, but also through interaction with other agents and the exchange of those concepts for knowledge sharing. Vernon perfectly summarizes that in:</p>
<blockquote>
  <p>the meaning of the knowledge that is shared is negotiated and agreed by consensus through interaction</p>
</blockquote>

<p>Though the supply chain agent would certainly benefit from this skill, its applicability in business contexts remains dubious. The intellectual property the system is intended to conceptualize is secret by definition; therefore the propensity to share information that may leak confidential data (business internal structures, process models of any kind) is approximately zero. Even non-confidential data may not be shareable, for the risk of advantaging competing agents.</p>

<h1 id="overview-of-main-cognitive-architectures">Overview of main cognitive architectures</h1>
<p>Over decades of research, the field of cognitive systems has seen the rise of a multitude of different designs for what could be defined as <em>the artificial cognitive architecture</em>. The two most articulated approaches that have been developed are the SOAR and ACT-R architectures. They have been inspired by diametrically different motivations: ACT-R is definitely inclined toward modelling and replicating human cognitive processes, while SOAR was driven by the ambition to push the functionalities of an intelligent cognitive system as far as possible, whether or not inspired by existing biology.<br />
It turns out that despite the differences, the two paradigms are not so far from each other; a grossly simplified comparison is provided here.</p>

<h3 id="comparison-soaract-r">Comparison SOAR/ACT-R</h3>
<p>Both architectures preserve the separation of declarative memory (the set of information we may consciously think of when we need it) from procedural memory (enacted and automated skills, such as riding a bike). Both approaches are mostly based on symbolic representation, though actions (operators in SOAR) are evaluated against continuous ranges of utilities, which are adopted as rewards for reinforcement learning. <br />
While the concept of <em>working memory</em>, borrowed from cognitive psychology, is a pillar of both systems, it is implemented in different ways: in SOAR, working memory is a centralized space for imminent processing; in ACT-R it is a sparse swap space between modules and central processing. Both being goal oriented, the approach taken for decision making in SOAR differs and deserves a mention. <br />
Whenever there are multiple operators to select, SOAR attempts to fine-tune the ranking so that just one operator (the action to take) emerges. It solves such an <em>impasse</em> by setting intermediate goals with the purpose of discriminating the best operator. Among the list of differences, I find the different learning consolidation between the two interesting, surely due to the different foundational motivations: in ACT-R a rule is consolidated (stored and actionable in procedural memory) after a series of re-learnings; in SOAR the consolidation is immediate after its production.</p>
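The impasse mechanism can be caricatured in a few lines of Python. All names and utility values below are invented for illustration, and SOAR's actual decision cycle is far richer; the sketch only shows the shape of the idea: when utilities fail to single out one operator, a subgoal is raised to discriminate among the tied candidates.

```python
def select(operators: dict[str, float]):
    """Toy SOAR-style operator selection over a {name: utility} map."""
    best = max(operators.values())
    tied = [op for op, u in operators.items() if u == best]
    if len(tied) == 1:
        return tied[0]            # one operator emerges and is applied
    return ("impasse", tied)      # subgoal: gather knowledge to break the tie

select({"reorder": 0.7, "switch_supplier": 0.7, "wait": 0.2})
# → ('impasse', ['reorder', 'switch_supplier'])
```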

<h1 id="cognitive-system-structure">Cognitive system structure</h1>

<h2 id="data-input">Data input</h2>
<p>To perform the expected analysis, the system requires different types of information: process models and sparse factual data. The first defines the processes that the system is intended to optimize as far as supply management is concerned. The second is the flow of information that might trigger cascading events affecting evaluations and alarms. The process models are transcribed and supervised by experts but also derived from already existing representations such as <a href="https://en.wikipedia.org/wiki/Business_process_modeling">BPM</a> models, so it is both a manual and an automatic process. The data flow, on the other hand, is completely automatic, and the data sources may be listed as:</p>
<ol>
  <li>suppliers info:
    <ol>
      <li>official financial reports, e.g. quarterly CEO reports.</li>
      <li>news &amp; rumors from social media/magazines.</li>
    </ol>
  </li>
  <li>central banks communications. Interest rates might strongly affect costs of debts.</li>
  <li>inventory reports. Aggregated changes on company inventories can say something regarding raw material trend prices.</li>
  <li>Oil prices. Anything that might affect transportation cost.</li>
  <li>International political tensions. Clashes between countries/groups may result in chain disruptions.</li>
  <li>important local events like Olympiads or G7 meetings may alter the usual exchange of goods and the freedom of movement of people.</li>
  <li>natural disasters.</li>
  <li>historic records of all the above, for training purposes.<br />
Summarizing, I would group the types of input into:</li>
  <li>news text streams</li>
  <li>company and supplier models that describe business and internal processes. Processes are described by symbolic means through some domain-specific language such as BPM or another logical and temporal language. Such a representation should be denotative and connotative for objects and processes <sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">7</a></sup></li>
</ol>
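To make the first input type concrete, here is a toy sketch of the transformation a news digester could apply: free text mapped to (subject, predicate, object) triples ready for a triple store. The `digest` function and its single hand-coded sentence pattern are invented for illustration; real extraction would require a full NLP pipeline.

```python
def digest(sentence: str) -> list[tuple[str, str, str]]:
    """Map one hand-coded sentence shape, '<X> raises <Y> by <Z>', to triples."""
    subject, _, rest = sentence.partition(" raises ")
    obj, _, amount = rest.partition(" by ")
    return [(subject, "raises", obj), (obj, "changedBy", amount)]

triples = digest("ECB raises interest rates by 0.5%")
# → [('ECB', 'raises', 'interest rates'), ('interest rates', 'changedBy', '0.5%')]
```

Each emitted triple is the kind of propositional statement the episodic long-term memory described below would persist.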

<h2 id="architecture-overview">Architecture overview</h2>
<p>What is <em>architecture</em> in the context of a cognitive system? Again, Vernon<sup id="fnref:4:1" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup> comes to rescue with:</p>
<blockquote>
  <p>…architecture has been borrowed by many other disciplines to serve as a catch-all term for the technical speciﬁcation and design of any complex artifact. Just as with architecture in the built environment, system architecture addresses both the conceptual form and the utilitarian functional aspects of the system, focusing on inner cohesion and self-contained completeness.</p>
</blockquote>

<p>The <em>form follows function</em><sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">8</a></sup> approach adopted for this project draws attention toward the SOAR architecture for a variety of reasons, such as:</p>
<ol>
  <li>A completely functionalist approach is adopted. It is irrelevant whether the set of cognitive processes is close to the current state of knowledge on how the human brain processes information. While it is important to consider the implications an intelligent system might have for accomplishing its functions, it is not vital that they adhere to the human mind.</li>
  <li>The knowledge the system handles will always be incomplete; therefore chains of reasoning will always encounter <em>impasses</em>. The impasse will be the natural intermediate state of any evaluation, since semantic and procedural knowledge will never be suited to all encountered fact consequences.<br />
However, some ideas will be borrowed from ACT-R as well:</li>
  <li>Production compilation. This is a learning task devoted to aggregating multiple rules into macro-functionalities. Rule re-usability is an optimization milestone for speeding up the learning of complex production rules.</li>
</ol>

<h2 id="modules">Modules</h2>

<figure class="align-center">
  <a href="/assets/images/soar-cos623.jpg">
  <img src="/assets/images/soar-cos623.jpg" alt="" /></a>
  <figcaption>Fig. 1 cognitive system architecture</figcaption>
</figure>

<ol>
  <li>News Digester. The module transforms the incoming data from news feeds (but also many other sources, see the description above) into propositional statements in first-order logic. Those statements will be stored in a triple-store database, which constitutes the episodic long-term memory.</li>
  <li>Chunking. This module is inspired by the homonymous component in SOAR. It is devoted to learning new behaviors/actions under certain conditions. Learning means creating new rules that can accomplish better estimations. Since the knowledge encoding is symbolic, chunking would comprise learning methods for discrete representations such as probabilistic inductive logic programming, which combines deterministic learning with a <em>probabilistic likelihood</em> of the output propositions.</li>
  <li>Simulator. How is the system supposed to answer questions about forecasts under hypothetical events and conditions? This is what the simulator is entitled to do. It runs all existing rules over all defined processes until a number of <a href="https://en.wikipedia.org/wiki/Stable_model_semantics"><em>stable models</em></a> are generated by the system. Those models represent states that might occur at a determined time step, so they will be used to extract the requested metric (delivery date, prices, etc.) in that particular case. Such models are probabilistic, meaning they are associated with a degree of uncertainty that is reflected in the requested metrics as well.</li>
  <li>Optimizer. Paired with the simulator, an optimization engine will act as a <em>rational daemon</em> for the chain of decisions a player involved in the business processes is expected to make. The simulation will involve game-theoretical analysis, because external agents’ decisions must be taken into account when evaluating requested metrics. For example, changing supplier might put the excluded one in a position to offer their products/services at a discount to competitors, initiating a cascading chain of eroded margins that is not welcomed by shareholders. The optimizer would trigger that alarm during the simulation, informing the analysts of possible earning risks.</li>
  <li>Scenario builder. This is the user interface module, for gathering the constraints and given events for running simulations. The module will also allow retrieving specific evaluation metrics for the current state of events.<br />
The list above comprises the computational modules intended to elaborate information, which is stored and represented in the storage modules described below:</li>
  <li>Episodic long-term memory. This database holds all propositions that regard events and facts. The information is saved in a triple-store database.</li>
  <li>Semantic long-term memory. Here, the information representation differs from the episodic memory. A <em>probabilistic likelihood</em> is associated with propositional rules, which might have particular additional characteristics such as <em>symmetry</em> and <em>transitivity</em>. This information module is <em>recursive</em>, in the sense that it is possible to specify recursive relations, which have the advantage of being very <em>declarative</em> and concise in their definition. Of course, the database should be capable of dealing with recursive structures and implement particular policies for optimizing the (virtually infinite) depth of computation.</li>
  <li>Working memory. Among the storage modules this is the most <em>active</em>, in the sense that it is supposed to cache temporarily generated information; at the same time, WM should <em>anticipate</em> the demand for information from the long-term (and slow) memories. Therefore, particular expedients should be employed to optimize database lookups.</li>
  <li>Procedural memory. This is the place for storing internal/external production and business process models, which are provided by knowledgeable operators. This is the core of the expert-system side of the entire system, where human experience is encoded to be processed by the automatic system.</li>
</ol>
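As an illustration of the kind of recursive, probabilistic rule the semantic long-term memory could hold, the sketch below treats "depends on" as a transitive relation and multiplies likelihoods along the supply chain. The edge names, the numbers, and the naive product-of-probabilities semantics are all assumptions made for the example; a real system would use a proper probabilistic logic.

```python
# Direct supply edges: supplier -> list of (customer, likelihood of dependence).
# Entities and weights are invented for illustration.
supplies = {
    "mine":    [("smelter", 0.9)],
    "smelter": [("factory", 0.8)],
    "factory": [("brand", 0.95)],
}

def depends_on(a: str, b: str, p: float = 1.0) -> float:
    """Recursive rule: likelihood that b transitively depends on a,
    taking the best path and multiplying edge likelihoods along it."""
    if a == b:
        return p
    return max((depends_on(c, b, p * w) for c, w in supplies.get(a, [])),
               default=0.0)

risk = depends_on("mine", "brand")   # 0.9 * 0.8 * 0.95
```

The recursion shows why the declarative definition is concise, and also why the database layer needs policies to bound the (virtually infinite) depth of computation, e.g. cycle detection or memoization.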

<h5 id="footnotes">footnotes</h5>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:6" role="doc-endnote">
      <p>I want to remark on the meaning of <em>learning</em>, since it is commonly associated with pattern matching in machine learning tasks. Learning could be defined as a meta-function that lies - in the hierarchy of cognitive functions of the human mind - above perception, memory, thinking, and acting. The learning process encodes (transforms) information into organized mental representations connected with other known concepts. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p><em>“Becoming a data-aware organisation means <strong>being able to see data opportunities and risks and translate them to actions</strong>. For that, you need an organisation that looks at projects from the data point of view and a few data specialists (depending on the size of your team) who can put that perspective into practice.”</em> - <a href="https://www.partos.nl/nieuws/data-awareness-main-takeaways/#:~:text=Becoming%20a%20data%2Daware%20organisation,put%20that%20perspective%20into%20practice.">source</a> <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p><em>prototype theory is a theory of categorization in cognitive science, particularly in psychology and cognitive linguistics, in which there is a graded degree of belonging to a conceptual category, and some members are more central than others.</em> - <a href="https://en.wikipedia.org/wiki/Prototype_theory">source</a> <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Vernon, David. <em>Artificial cognitive systems: A primer</em>. MIT Press, 2014. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:4:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>https://en.wikipedia.org/wiki/Theory-theory <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>Cognitive psychology uses different and more glamorous terms; I mention them since there is an intersection of different disciplines and it is extremely interesting to notice that. In that respect, <em>ontogeny</em> matches the definition of development, not to be confused with <em>phylogeny</em>, which refers to the change of an agent’s species through evolution. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1" role="doc-endnote">
      <p>Like in the “physical symbol systems” (Newell and Simon 1975) <img src="/assets/images/symbol-system.png" alt="" class="width-half" /> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>The idea “form follows function” was given by the architect Louis Sullivan. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="cognitive system" /><category term="artificial intelligence" /><category term="supply chain" /><summary type="html"><![CDATA[Short dissertation on how supply chain management might benefit from the adoption of a cognitive system, how the architecture might be modeled and the main challenges in pursuing the project]]></summary></entry><entry><title type="html">The Curious Case of Tip of the Tongue phenomenon</title><link href="https://gfrison.com/2024/04/11/tip-tongue" rel="alternate" type="text/html" title="The Curious Case of Tip of the Tongue phenomenon" /><published>2024-04-11T00:00:00+02:00</published><updated>2024-04-12T00:00:00+02:00</updated><id>https://gfrison.com/2024/04/11/tip-tongue</id><content type="html" xml:base="https://gfrison.com/2024/04/11/tip-tongue"><![CDATA[<p>Stuck on a name? It’s like having a word stuck right on the tip of your tongue, but you just can’t remember it. Like a ghost, it drives us in the right direction with a sense of closeness, but without ever revealing the name. Imagine trying to open a door with a bunch of keys, but none of them fit. That’s the tip-of-the-tongue feeling (TOT), and it happens to everyone sometimes.</p>

<p>The TOT state is also difficult to examine because, as you might imagine, it is identified only subjectively and escapes objective scrutiny. How do we recognize that we are in a TOT state? If you are unable to think of the word but <em>you feel sure</em> that you know it and that it is on the <em>verge of coming back</em> to you, then you are in a TOT state<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. However, it’s important to distinguish TOT states from a similar phenomenon known as the feeling of knowing (FOK). While during a TOT we have the sensation of being just one step away from pronouncing the answer, FOKs produce different feelings. They give us the resignation that we <em>don’t know the target word</em>, but that we can <em>eventually recognize</em> the right answer among some alternatives.</p>

<p class="notice--primary">Interested in language? Take a look into <a href="/2018/06/13/basic-principles-language">basic principles of language</a></p>

<p>If you feel uncomfortable experiencing a TOT, you may be pleased to know that the associated feeling usually lasts no longer than a minute, but be aware that half of the time a TOT is never resolved. The sensation of being able to retrieve the word is accompanied by clues that usually turn out to be correct, such as the first letter<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> and even the number of syllables of the target word. Psychologists found that dictionary definitions of uncommon English words were enough to trigger word-finding failures in their experiments. That’s why rare words, along with celebrity names, are used so widely in TOT experiments.</p>

<p>While we know all the sensations related to TOT, not all individuals experience this state with the same frequency. Students in their twenties might have one or two per week, while the frequency doubles for elderly people in their 80s. Age disparity seems to be the trait most correlated with this phenomenon, corroborating causal hypotheses that center their attention on cognitive degradation. While this seems a direct and intuitive explanation, it is not the end of the story. We will unfold some interesting reasons behind TOTs.</p>
<h1 id="hypothesis-survey">hypothesis survey</h1>
<p>The effect has been under scrutiny since its first mention<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup> in the 19th century, and while the early hypotheses were formulated in the thirties of the last century, the TOT state is a prolific arena of research even nowadays. What makes TOTs happen? Fingers point at cognitive processes that involve memory and language: in particular, the bridge between semantic memory, dedicated to memorizing causal relations among concepts, and the phonological loop, the mental system entitled to vocalize words. How might those modules give us the sensation of TOT?</p>
<figure>
  <img src="/assets/images/tot-diagram.png" alt="connection semantic and phonological systems" />
  <figcaption>Fig. 1 - representation of connections between semantic and phonological systems</figcaption>
</figure>

<p>When we try to remember, the meaning of the word allows direct and immediate access to its sound, thus enabling vocalization. Sometimes, however, the phonological entry is not fully activated, instilling a certainty of knowing in the absence of the ability to express it. What could be the origin of these impediments between the semantic and phonological systems? Several hypotheses have evolved over time.</p>
<h3 id="incomplete-retrieval-hypothesis">incomplete retrieval hypothesis</h3>
<p>Think of remembering a word: the more clues and references you have, the easier it is to narrow down your search. When there are many paths to the final destination, the target word, retrieval is more likely to be successful. Each clue acts as a signpost, pointing you closer to the target word<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">4</a></sup>. According to this idea, if there are not many paths to the right word, the search will be hard to solve. Therefore, it is a <em>lack of related clues</em> that afflicts the identification of the correct answer. If you find this theory appealing, you might be disappointed by some findings<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">5</a></sup> that undermine the validity of this hypothesis as <em>solely</em> responsible for TOT.</p>

<p>In the experiment, participants were “fired at” with casual words exactly while they were experiencing TOTs. Those “bullets” (metaphorically speaking) were either related or unrelated to the target. It turned out that unrelated words do not affect the resolution of the TOT. Related ones do have an influence, but in the opposite way than predicted by this theory: the more related words, the fewer successful resolutions.</p>
<h3 id="blocking-hypothesis">blocking hypothesis</h3>
<p>The experiment<sup id="fnref:6:1" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">5</a></sup> above may support a complementary explanation for TOTs. What if some clues, instead of favoring retrieval, prevent it? This is something Freud mentioned<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">6</a></sup>: wrong names that we associate with the target too early during the resolution phase,</p>
<blockquote>
  <p><em>“although immediately recognized as false, nevertheless obtrude themselves with great tenacity”</em></p>
</blockquote>

<p>According to the blocking hypothesis, a wrong candidate does not merely divert attention away from the right word: it actually <em>competes against the right target</em> as a solution. Related words obstruct the reach of the target, and it is this impediment that is responsible for holding back good resolutions<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">7</a></sup>. <br />
Similar words have such power that it is not even necessary to provide them directly. Unconscious alternative words that come to mind, too weak to be recalled themselves, may be capable of inhibiting target retrieval<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">8</a></sup> (somewhat compatible with the knowledge hypothesis, covered in a later section).</p>
<figure>
  <img src="/assets/images/tot-cenerentola.png" alt="TOT as the prince in the Cinderella story" />
  <figcaption>Fig. 2 - The prince (memory search) tries to find the one who fits the definition (Cinderella), but the ugly stepsisters (blockers) intercept the search effort</figcaption>
</figure>

<p>The blocking hypothesis also presents some weaknesses. According to the theory, a higher rate of TOTs should be associated with a correspondingly higher number of alternative words reported during the TOT state, because the theory prescribes that it is the alternatives that block the target, right? Actually not. A substantial amount of TOTs are reported without any alternative word<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">9</a></sup>. In this case too, the evidence does not clearly support the examined theory.</p>
<h3 id="inhibition-deficit-hypothesis">inhibition deficit hypothesis</h3>
<p>It is worth mentioning another hypothesis closely related to the blocking hypothesis. The inhibition deficit hypothesis (IDH) emphasizes the inhibitory control of speech. According to IDH, retrieval is successful when we can get rid of all distracting information and reach the target without being overwhelmed by irrelevant details. A TOT state is due to the confusing and distracting amount of <em>extraneous information</em> (a more formal term meaning not essential or necessary; it may sound a bit pedantic, but I like this term 😊) that impedes a straight resolution<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">10</a></sup>. In support of this, the capacity to exclude noise during cognitive processing degrades with age: older people shut out distractors less effectively than younger ones, which would explain why the elderly experience more TOTs. On the other hand, even this theory is not the whole picture. A body of evidence shows that older adults actually produce <em>fewer</em> alternates during a TOT than younger people<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">11</a></sup>. For this very reason the elderly should be affected less, not more, by TOTs. Something here does not add up.</p>
<h3 id="transmission-deficit-hypothesis">transmission deficit hypothesis</h3>
<p>This model suggests that TOT states arise from impairments in the <em>communication</em> between cognitive modules or brain regions, rather than blaming individual systems. The retrieval of phonological traits may be particularly susceptible to linking failures because it depends on fragile connections between a word’s lexical node (lemma) and its phonology (Fig. 1). The semantic and lexical systems, in contrast, are more richly connected, which is why they are more resistant to aging decay than the lexical/phonological link<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">12</a></sup>.<br />
If you notice many similarities with the incomplete retrieval hypothesis, you are not wrong: both point to a lack of strong connections. Interestingly, this is why some neurological evidence highlighting age-related issues supports both of them. Older adults exhibit reduced activity in the left insula, probably due to age-related atrophy<sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">13</a></sup>, an important region for phonological production<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">14</a></sup>.</p>
<h3 id="knowledge-hypothesis">knowledge hypothesis</h3>
<p>This idea rests on diametrically different assumptions than the previous ones. It suggests that TOTs aren’t caused by declining brainpower, but by having <em>too much knowledge</em>. When experimental results are controlled for the type of target word, something new pops up from the data: the author states that in experiments that do not require proper names (like celebrity names) there is no age difference in TOT rates<sup id="fnref:9:1" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">8</a></sup>. This seems a quite incisive discovery. <br />
If we take the transmission deficit hypothesis for granted, the more you know about the target word, the more the semantic system should compensate for the aging effect. Instead, the data show that the higher the knowledge, the higher the TOT rate. This aligns somewhat with the blocking hypothesis, where too many related words might get in the way.</p>
<h3 id="illusion-hypothesis">illusion hypothesis</h3>
<p>Among the explanations, this is the most controversial one. It says that the emotional charge carried during a TOT state is an induced feeling that triggers our curiosity about the target and raises our willingness to search for the right word. It is a mental trick that tells us: <em>“hey, for sure you know the answer, keep going!”</em>, pushing us to search more obstinately. As we will see in the next chapter, this is not really a disadvantage; rather, it is a positive feature of TOT. In situations of great uncertainty, this <em>meta-cognitive</em> signal can be helpful in pursuing the right action<sup id="fnref:16" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">15</a></sup>. In support of this thesis, it has been found that TOTs are more frequent in groups: groups magnify the feeling that the target word is within reach, prompting TOT states more often than occurs in single individuals<sup id="fnref:17" role="doc-noteref"><a href="#fn:17" class="footnote" rel="footnote">16</a></sup>.</p>
<h1 id="upsides-of-tot-states">upsides of TOT states</h1>
<p>Remember the “tip-of-the-tongue” feeling? It might be a good thing. As mentioned above for the illusion hypothesis, the TOT state is beneficial because it induces more curiosity and the motivation to spend more energy on reaching an achievement<sup id="fnref:16:1" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">15</a></sup>. Moreover, confidence in the apparently latent knowledge pushes us into riskier decisions. For example, participants in test quizzes are more willing to risk their grades when in a TOT state. It turns out that such bravery is rewarded with better results<sup id="fnref:16:2" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">15</a></sup>, and the frustrating feeling of inability is thus somehow balanced by higher performance.</p>
<h1 id="practical-applications">practical applications</h1>
<p>This phenomenon opens the way to practical applications, for example in the area of assessing students’ knowledge. If students cannot come up with a short answer, they could use TOTs to know when it is smart to ask for multiple-choice options. Quiz designers might exploit the TOT effect to build <em>adaptive tests</em> that let participants demonstrate various levels of knowledge, including knowledge that might be present but momentarily inaccessible<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">17</a></sup>.</p>
<h1 id="conclusion">conclusion</h1>
<p>This survey of potential explanations for the TOT phenomenon hasn’t yielded a definitive answer. Instead, the evidence suggests that several hypotheses likely contribute to TOTs, each with varying degrees of influence. Though not privileging a single explanation, we can de-emphasize the incomplete retrieval hypothesis due to the lack of compelling supporting evidence, while the blocking hypothesis might be encompassed by the knowledge hypothesis. Since age-related cognitive decline is a well-established factor, it cannot be ignored as a contributing factor in TOTs.</p>
<h1 id="more-curiosities"><em>more curiosities</em></h1>
<p>There are also some tricks that could be used to mitigate these kinds of retrieval failures. The blocking hypothesis affirms that we are more affected by TOTs when we cannot discern useful clues from irrelevant noise. So why not <em>consciously</em> attempt to ignore related but incorrect words? As Woodworth (1938) suggested:</p>
<blockquote>
  <p>…the wrong name recalled acquires a recency value and blocks the correct name … a rest<br />
interval allows the recency value of the error to die away</p>
</blockquote>

<p>A curious aspect of TOTs has been found in the ethical sphere. The sensation associated with a TOT is a kind of <em>warm glow</em> that attributes high ethical value to memories we are <em>not able to remember</em>. Curiously, when celebrities’ names induce TOT states, those celebrities are judged more ethical<sup id="fnref:3:1" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">17</a></sup>.</p>

<p><a href="https://gfrison.com">Giancarlo Frison</a></p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>(p. 329). R. Brown and McNeill (1966). The “tip of the tongue” phenomenon. <em>Journal of Verbal Learning &amp; Verbal Behavior, 5</em>(4), 325–337. Doi: 10.1016/S0022-5371(66)80040-3 <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>Usually correctly guessed 50% of the times. Rubin, D. C. (1975). Within word structure in the tip-of-the-tongue phenomenon. Journal of Verbal Learning and Verbal Behavior, 14, 392-397. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>James W. Principles of psychology. New York: Holt; 1890. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>Wenzl (1932, cited in Woodworth, 1938), R. Brown (1970) <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>Jones, G.V. Back to Woodworth: Role of interlopers in the tip-of-the-tongue phenomenon. <em>Memory &amp; Cognition</em> <strong>17</strong>, 69–76 (1989). Doi: 10.3758/BF03199558 <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:6:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>Freud (1901, cited in Reason &amp; Lucas, 1984) <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>Burke et al., 1988; Reason &amp; Lucas, 1984; Jones, 1989; Jones &amp; Langford, 1987 <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>Burke et al, 1991 - <em>On the tip of the tongue: What causes word finding failures in young and older adults?</em> Doi: 10.1016/0749-596X(91)90026-G <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:9:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Reason, J. T., &amp; Lucas, D. (1984). Using cognitive diaries to investigate naturally occurring memory blocks. <em>Everyday memory, actions and absent-mindedness</em>, 53-70. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:11" role="doc-endnote">
      <p>Awh, Matsukura &amp; Serences, 2003; McClelland &amp; Rumelhart, 1981; Ridderinkhof, Band, &amp; Logan, 1999 <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12" role="doc-endnote">
      <p>Burke et al., 1991; Burke &amp; Shafto, 2004; Fraas et al., 2002; White &amp; Abrams, 2002 <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13" role="doc-endnote">
      <p>Cognition, Language and Aging, doi: 10.1075/z.200; derived from a theory of language production called the Node Structure Theory (MacKay, 1987). <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:15" role="doc-endnote">
      <p>Shafto, Meredith A., et al. “On the tip-of-the-tongue: Neural correlates of increased word-finding failures in normal aging.” <em>Journal of cognitive neuroscience</em> 19.12 (2007): 2060-2070. <a href="#fnref:15" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:14" role="doc-endnote">
      <p>Shafto, Meredith A., et al. “Word retrieval failures in old age: the relationship between structure and function.” <em>Journal of Cognitive Neuroscience</em> 22.7 (2010): 1530-1540. <a href="#fnref:14" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:16" role="doc-endnote">
      <p>Metcalfe et al.,2017; Schwartz &amp; Cleary, 2016 <a href="#fnref:16" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:16:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a> <a href="#fnref:16:2" class="reversefootnote" role="doc-backlink">&#8617;<sup>3</sup></a></p>
    </li>
    <li id="fn:17" role="doc-endnote">
      <p>Socially Shared Feelings of Imminent Recall: More Tip-of-the-Tongue States Are Experienced in Small Groups - Rousseau, Kashur 2021. <a href="#fnref:17" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>The tip-of-the-tongue state as a form of access to information: Use of tip-of-the-tongue states for strategic adaptive test-taking - Cleary 2021 <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:3:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="cognitive psycology" /><category term="memory" /><category term="learning" /><category term="language" /><summary type="html"><![CDATA[a brief survey of the tip of the tongue phenomenon. Why it occurs? It is really because of age? Those and many other questions will be answered.]]></summary></entry><entry><title type="html">Short glimpse on predicate and first order logic</title><link href="https://gfrison.com/2023/12/17/glimpse-over-predicate-first-order-logic" rel="alternate" type="text/html" title="Short glimpse on predicate and first order logic" /><published>2023-12-17T00:00:00+01:00</published><updated>2024-02-28T00:00:00+01:00</updated><id>https://gfrison.com/2023/12/17/glimpse-over-predicate-first-order-logic</id><content type="html" xml:base="https://gfrison.com/2023/12/17/glimpse-over-predicate-first-order-logic"><![CDATA[<p>Course: <a href="https://learn.ki-campus.org/courses/foundationsofai-III-dfki2021">Foundations of Artificial Intelligence III</a><br />
This course spans topics that can be grouped into three main areas: propositional logic (PL), first order logic (FOL), and reasoning and satisfiability. The literature mentioned in this course refers to Chapters 7, 8 and 9 of the AIMA book. The example application used in the course is (among many others) the <em>wumpus world</em> (inspired by the one in the AIMA book), which serves as a <em>playground</em> for the PL and FOL explanations. <strong>Logic</strong> is a general class of representations to support knowledge-based agents. Such agents can combine and recombine information to suit myriad purposes. A logic must also define the semantics, or meaning, of sentences. The semantics defines the truth of each sentence with respect to each possible world. For example, the semantics for arithmetic specifies that the sentence “<code class="language-plaintext highlighter-rouge">x + y = 4</code>” is true in a world where x is <code class="language-plaintext highlighter-rouge">2</code> and y is <code class="language-plaintext highlighter-rouge">2</code>, but false in a world where x is 1 and y is 1.</p>

<h2 id="propositional-logic">Propositional Logic</h2>
<p><strong>PL</strong> is a simple language consisting of proposition symbols and logical connectives. Its <strong>syntax</strong> defines the allowable sentences, while its <strong>semantics</strong> defines the rules for determining the truth (just <code class="language-plaintext highlighter-rouge">true</code> or <code class="language-plaintext highlighter-rouge">false</code>) of PL sentences with respect to a particular model. The <strong>semantics</strong> must specify how to compute the truth value of any sentence, given a model; this is done recursively, since all sentences are constructed from atomic sentences and the five connectives.<br />
A <em>model</em> is a truth assignment over the propositions of a knowledge base (KB), which is a set of sentences. A sentence is an <strong>axiom</strong> when it is taken as given without being derived from other sentences; other sentences can be derived from those that logically entail them. <strong>Entailment</strong> is the idea that a sentence follows logically from another sentence. In mathematical notation, we write α $\vDash$ β if and only if, in every model in which α is true, β is also true.</p>
<h4 id="inference">Inference</h4>
<p>Sentence derivation is done by running an inference algorithm that follows the <em>modus ponens</em> paradigm or the <em>modus tollens</em>. The former gives rise to the <em>deductive</em> <strong>forward chaining</strong> (FC) family of algorithms, the latter to the <em>inductive</em> <strong>backward chaining</strong> (BC) family. FC is an example of the general concept of data-driven reasoning: reasoning in which the focus of attention starts with the known data. BC, as its name suggests, works backward from the query. If the query $q$ is known to be true, then no work is needed; otherwise, the algorithm finds those implications in the knowledge base whose conclusion is $q$.</p>
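<p>As a concrete illustration (a minimal sketch of my own, not from the course material), forward chaining over propositional Horn clauses fits in a few lines: each rule is a set of premises plus a conclusion, and rules keep firing until no new atom can be derived.</p>

```python
# Forward chaining for propositional Horn clauses (illustrative sketch).
# A rule is a (premises, conclusion) pair; premises is a set of atoms.
def forward_chain(facts, rules):
    """Return every atom derivable from `facts` under `rules`."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule only if all premises hold and it adds something new.
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# A ∧ B → C and C → D, starting from facts A and B
rules = [({"A", "B"}, "C"), ({"C"}, "D")]
print(sorted(forward_chain({"A", "B"}, rules)))  # ['A', 'B', 'C', 'D']
```

<p>Backward chaining would instead start from the query atom and recursively try to prove the premises of each rule that concludes it.</p>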
<h4 id="resolution">Resolution</h4>
<p>For its proofs to be trusted, an inference procedure must be <strong>sound</strong> and <strong>complete</strong>. A sentence is <strong>valid</strong> if it is true in all models; for example, the sentence P ∨ $\neg$P is valid. Valid sentences are also known as <strong>tautologies</strong>: they are necessarily true. An argument is <strong>sound</strong> if it is valid and its premises are true. To be complete, the inference algorithm must be able to derive any sentence that is entailed. <strong>Proof</strong> is obtained by <em>reductio ad absurdum</em> (also called proof by refutation or contradiction), by which α $\vDash$ β if and only if the sentence (α ∧ $\neg$β) is unsatisfiable. How does resolution work to obtain a proof?</p>
<ul>
  <li>First, $KB \wedge \neg α$ is converted into CNF, the <strong>conjunctive normal form</strong> (a formula consisting only of a conjunction of disjunctions).</li>
  <li>Each pair that contains complementary literals is resolved to produce new clauses, until:
    <ul>
      <li>There are no new clauses that can be added, in which case KB does not entail $α$.</li>
      <li>Two clauses resolve to yield the empty clause, in which case KB entails $α$.<br />
<img src="/assets/images/resolution.png" alt="example of resolution" /></li>
    </ul>
  </li>
</ul>
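<p>The refutation loop above can be sketched in code. In this toy version (my own illustration, not the course's), a literal is a string whose negation is marked by a leading <code class="language-plaintext highlighter-rouge">-</code>, a clause is a frozenset of literals, and deriving the empty clause signals the contradiction that proves entailment.</p>

```python
from itertools import combinations

def resolve(c1, c2):
    """All resolvents of two clauses (clauses are frozensets of literals)."""
    resolvents = set()
    for lit in c1:
        comp = lit[1:] if lit.startswith("-") else "-" + lit
        if comp in c2:
            # Drop the complementary pair and merge the remainders.
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {comp})))
    return resolvents

def entails(kb_clauses, negated_query_clauses):
    """KB entails α iff KB ∧ ¬α resolves to the empty clause."""
    clauses = set(kb_clauses) | set(negated_query_clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True   # empty clause: KB ∧ ¬α is unsatisfiable
                new.add(r)
        if new <= clauses:
            return False          # no new clauses: KB does not entail α
        clauses |= new

# KB: P and P → Q (i.e. ¬P ∨ Q); query α = Q, so we add ¬α = ¬Q
kb = [frozenset({"P"}), frozenset({"-P", "Q"})]
print(entails(kb, [frozenset({"-Q"})]))  # True
```

<p>The loop terminates because only finitely many clauses can be built from a finite set of literals.</p>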

<h2 id="first-order-logic">First order logic</h2>
<p>PL lacks the expressive power to concisely describe an environment with many objects.<br />
The language of FOL is built around objects and relations: it assumes that the world consists of objects with certain relations among them that do or do not hold. While propositional logic commits only to the existence of facts, FOL commits to the existence of objects and relations, and thereby gains expressive power. FOL represents objects, predicates and functions. Predicates are $n$-ary relations among objects, also employed to express features of a single object. Functions are expressions that return a single object out of a set of arguments. <br />
In FOL it is natural to express properties of entire sets of objects, which is done by adopting the <strong>existential quantifier</strong> ($\exists$) and the <strong>universal quantifier</strong> ($\forall$).</p>
<h4 id="natural-numbers">Natural numbers</h4>
<p>An interesting application of FOL is the description of <a href="https://en.wikipedia.org/wiki/Peano_axioms">Peano numbers</a>, a simple way of representing the natural numbers ($Nat$) <em>recursively</em> using only an axiom value ($Zero$) and a successor function $succ(Nat)$: <br />
\(Nat(Zero).
\forall x [Nat(x) \rightarrow Nat(succ(x))].\)</p>
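<p>The same recursive construction is easy to mirror in code. Below is a hypothetical encoding of my own: <code class="language-plaintext highlighter-rouge">Zero</code> is an empty tuple and each application of <code class="language-plaintext highlighter-rouge">succ</code> wraps its argument, so a natural number is just the depth of nesting.</p>

```python
# Peano naturals: an axiom value plus a successor function (illustrative encoding).
Zero = ()

def succ(n):
    """Wrap n one level deeper: succ(n) represents n + 1."""
    return ("succ", n)

def to_int(n):
    """Count the succ applications to recover the ordinary integer."""
    count = 0
    while n != Zero:
        _, n = n  # peel one 'succ' layer
        count += 1
    return count

three = succ(succ(succ(Zero)))
print(to_int(three))  # 3
```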
<h4 id="unification">Unification</h4>
<p>This is the process of making different logical expressions look the same, whenever possible, by properly substituting the values of variables. With unification it is possible to construct all queries that unify with a given sentence. E.g., given $Employs(SAP, Giancarlo)$ and $Male(Giancarlo)$, one query might be: <em>is there a male employee in SAP?</em> In FOL it is: $\exists x [Male(x) \wedge Employs(SAP,x)]$.</p>
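<p>A minimal unification routine makes the substitution step concrete. This is a sketch under my own conventions, not a definitive implementation: lowercase strings are variables, tuples are compound terms, and the occurs check is omitted for brevity.</p>

```python
def is_var(t):
    """A term is a variable if it is a lowercase-initial string."""
    return isinstance(t, str) and t[:1].islower()

def walk(t, s):
    """Follow variable bindings in substitution s until a non-bound term."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(x, y, s=None):
    """Return a substitution dict unifying x and y, or None on failure."""
    s = {} if s is None else s
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if is_var(x):
        return {**s, x: y}
    if is_var(y):
        return {**s, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        # Unify compound terms argument by argument, threading the substitution.
        for a, b in zip(x, y):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

# Unify a query pattern with a known fact: which x is employed by SAP?
print(unify(("Employs", "SAP", "x"), ("Employs", "SAP", "Giancarlo")))
# {'x': 'Giancarlo'}
```

<p>The resulting substitution can then be applied to the rest of the query, exactly as in the example above.</p>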
<h4 id="reasoning-in-fol">Reasoning in FOL</h4>
<p>Reasoning in FOL works by bringing the formula into <em>Skolem</em> form, removing the existential and universal quantifiers, and transforming it into <em>clause normal form</em> (a formula composed of clauses separated by commas). Then propositional reasoning (resolution, SAT) or forward and backward chaining can be applied.</p>
<h4 id="herbrand-universe">Herbrand universe</h4>
<p>The Herbrand Universe (HU) is the set of all ground terms (non-variables) present in a formula. E.g., in the clause formula $CapitalOf(x,y) \rightarrow IsA(x,City), IsA(y,Country), PartOf(x,y)$ we have $HU = \{City, Country\}$; but if a function were present, the HU would be <em>infinite</em>. The HU is useful, for example, to restrict the search space of Prolog programs whenever possible.  <br />
The HU plays a part in <em>resolution in FOL</em> through the <strong>Herbrand expansion</strong>, the set of formulas that results from substituting terms into the initial formula in all possible ways. Given a knowledge base: $KB = \forall x [SpecialAgent(x) \rightarrow  SpiesOn(x, Danz)] \wedge SpecialAgent(MrSmith)$ <br />
it reads as <em>“every special agent spies on Danz, and MrSmith is a special agent”</em>.<br />
If we advance the hypothesis that the formula $\phi=SpiesOn(MrSmith, Danz)$ is entailed by KB ($KB \vDash \phi$), the Herbrand expansion can be applied to show that $HE(KB \wedge \neg\phi)$ is unsatisfiable.</p>]]></content><author><name>Giancarlo Frison</name></author><category term="logic" /><category term="artificial intelligence" /><summary type="html"><![CDATA[Course: Foundations of Artificial Intelligence III This course spans topics that can be grouped into three main areas: propositional logic (PL), first order logic (FOL), and reasoning and satisfiability. The literature mentioned in this course refers to Chapters 7, 8 and 9 of the AIMA book. The example application used in the course is (among many others) the wumpus world (inspired by the one in the AIMA book), which serves as a playground for the PL and FOL explanations. Logic is a general class of representations to support knowledge-based agents. Such agents can combine and recombine information to suit myriad purposes. A logic must also define the semantics or meaning of sentences. The semantics defines the truth of each sentence with respect to each possible world. 
For example, the semantics for arithmetic specifies that the sentence “x + y = 4” is true in a world where x is 2 and y is 2, but false in a world where x is 1 and y is 1.]]></summary></entry><entry><title type="html">a brief account of the intersection between perception and attention</title><link href="https://gfrison.com/2023/11/29/essay-perception-attention" rel="alternate" type="text/html" title="a brief account of the intersection between perception and attention" /><published>2023-11-29T00:00:00+01:00</published><updated>2023-11-29T00:00:00+01:00</updated><id>https://gfrison.com/2023/11/29/essay-perception-attention</id><content type="html" xml:base="https://gfrison.com/2023/11/29/essay-perception-attention"><![CDATA[<p>Attention and perception are two fundamental elements enabling the cognitive processes that allow us to function in the environment where we strive to survive. As with any other functionality of complex systems, single traits should not be seen solely on their own but in conjunction with all the others that contribute to compounded behaviors.<br />
Attention and perception are no exception to this rule.</p>

<p>As in the established definition of <em>agent</em><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, sensation is the process of acquiring information from the environment through sensors. The behaviour the agent manifests is the result of pondered actions, activated by proper strategies that aim to achieve desirable goals. Living creatures, including humans, do not deviate from this general definition. Attention can be seen as a function that helps elaborate successful strategies efficiently, by partitioning what can really affect the goals from what cannot<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>

<p>Sensation takes its meaning directly from the raw acquisition of information through the senses. These include, of course, the five canonical ones, but are not limited to them: they also cover movement, position of the body, pain and temperature. It can be surprising how sensitive human senses are. For example, it is possible to detect a candle almost 50 km away on a dark and clear night<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">3</a></sup>. Perception, on the other hand, is a more abstract concept.</p>

<p>When perception is closely tied to sensation it is defined as <em>bottom-up perception</em>. In this case, perception is the interpretation of information from the environment so that we can identify its meaning<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup> and make more or less accurate predictions about what should be there. Perception, however, also emerges from cognitive processes that involve memory and attention, completely detached from the sensory circuits. This scenario is described as <em>top-down</em> perception, a process strongly mediated by expectations and by the contextual setup in which those sensations are collected<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">5</a></sup>. In the image below, the shadow is <em>expected</em> to lower the brightness of the <code class="language-plaintext highlighter-rouge">B</code> region, so a kind of innate visual compensation makes us think that the region <em>should</em> be brighter without the shadow, generating the famous visual illusion.</p>

<table>
  <thead>
    <tr>
      <th><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/b/be/Checker_shadow_illusion.svg/440px-Checker_shadow_illusion.svg.png" alt="" /></th>
      <th><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Grey_square_optical_illusion_proof2.svg/440px-Grey_square_optical_illusion_proof2.svg.png" alt="" /></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td> </td>
      <td> </td>
    </tr>
  </tbody>
</table>

<p>Attention and perception are highly correlated, and this is very evident in everyday life. If we are walking in the university department looking for the Cognitive Psychology professor, it won’t be surprising that we pay attention to people’s faces, <em>visually searching</em> the crowd until the target <em>pops out</em> in the hall. Eye movements will be directed to scan the faces of the people<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">6</a></sup>. If expectations are violated by novel surprises (e.g.: an old mate in the classroom), these are explored extensively<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">7</a></sup>. The other side of the coin is <em>inattentional blindness</em>, by which whatever is not attended during the attention phase is not consciously perceived. For example, in the figure below, 60-80% of observers do not notice the center point, which turns into a text while a distracting cross appears on the left<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">8</a></sup>.</p>

<p><img src="/assets/images/cognitive1.jpg" alt="" /></p>

<p>I’ve mentioned <em>popping out</em> to describe the subjective effect we may have when something <em>catches the attention</em> with no effort. This is an interesting topic that shows particular dynamics depending on the environmental context. If the search task involves picking out a simple feature from a context strewn with <em>distracting</em> details, it is much easier than searching for a complex pattern with a combination of features, the so-called <em>conjunctive search</em> <sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">9</a></sup>. In the experiment below, recognizing a single red dot among green ones is automatic, while the case in the right pane requires some degree of attentive scrutiny, which is much slower and more resource-demanding.</p>

<p><img src="/assets/images/cognitive2.jpg" alt="" /></p>

<p>This might suggest that attention is somehow necessary for recognizing non-trivial objects <sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">10</a></sup>. The <strong>feature integration theory</strong> <sup id="fnref:4:1" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">10</a></sup> affirms that automatic feature processing is followed by attentive processes that bind the features into a whole object. Objects are not a mere list of independent features, and perception is not limited to <em>pattern matching</em>: it also identifies higher-order structures, with the support of controlled attention. <br />
Just as the letter <code class="language-plaintext highlighter-rouge">N</code> is not simply a random aggregation of three segments, things must obey some <em>grammar</em> to be recognized. <br />
A demonstration of the importance of structural relations is depicted in the image below, where a set of simple forms (geons) can easily compose complex objects <sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">11</a></sup>. This shows that relational information is not only needed, but is even more critical to perception than the features themselves <sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">12</a></sup>.</p>

<p><img src="/assets/images/cognitive3.jpg" alt="" /></p>

<p>While some information needs attention to retrieve complex structures from the senses, an equal effort is required to selectively ignore conflicting information while performing a task. This is the case raised by John Stroop in his famous experiment <sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">13</a></sup>, which demonstrates the difficulty of separating conflicting information from useful information. In the Stroop task, an observer reports the ink colour of appearing words while the words spell out a different colour’s name. The task turns out to be harder than the non-conflicting setting (when the name matches the colour). On discordant stimuli, observers activate different parts of the brain, those in charge of executive control and <em>selective attention</em> functions. For the same reason, when we stop at a traffic light while driving a car, we discriminate the light of our own lane from the adjacent ones. Selective attention prevents us from switching from the brake to the accelerator pedal when it is not appropriate.</p>

<p>The selective attention theory (Broadbent’s filter model) states that perception is filtered <em>before</em> being processed by high-level mechanisms, but it clashes with the notable cocktail party effect (CPE). Unfortunately for Broadbent, the CPE suggests that the selective theory is not the only way the brain deals with attention. The CPE describes the capability of being caught by unattended stimuli once they present an important pattern. For example, while we are talking with friends and someone outside the interlocutors’ circle loudly mentions our name, our attention will probably be drawn toward that speaker and his discourse, even though we were previously unaware of the discussion.</p>

<p>An interesting effect of unattended stimuli is that they interfere with attended perceptions. The experiments that shed light on this effect enforce <em>shadowing</em>, by which observers are instructed to follow only one stream of perception among many. Suppose two recorded discourses are played simultaneously, one to the left side and the other to the right side of the observer’s headset:</p>
<blockquote>
  <p>a: “They were standing near the bank…”<br />
b: “the silicon valley bank has gone bankrupt…” (not exactly the example of the original experiment, since that bank went bankrupt in 2023, but I guess it is equivalent for the purposes of the experiment)</p>
</blockquote>

<p>Observers disambiguate the term <em>bank</em> as the <em>financial bank</em>, and not as any other possible meaning <sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">14</a></sup>. This experiment tells us that although attention allocates its resources to <em>spotlight</em> some stimuli, it leaves space for unconscious mechanisms that capture information from the unattended channel and blend it with the attended one.</p>

<p>Giancarlo Frison</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Russell, Stuart J.; Norvig, Peter; Artificial Intelligence: A Modern Approach  (2003, 2nd ed.); Chapter 2. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>There is no clear definition of what attention is: <em>“No one knows what attention is”</em> (Hommel et al., 2019).  The first attempt to categorize it comes from William James in The Principles of Psychology (1890): <em>“is the taking possession by the mind, in clear and vivid form, of one out of what may seem several simultaneously possible objects or trains of thought…It implies withdrawal from some things in order to deal effectively with others”</em>. In the Schema Theory (Neisser, 1976), <em>attention is a dynamic process that seeks information consistent with current situation</em>. I think a good synthesis of many definitions could be summarized into <em>“attention is the allocation of resources and processing to a particular object, region, dimension”</em>. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6" role="doc-endnote">
      <p>Okawa &amp; Sampath, 2007 <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>William Wozniak. <a href="https://www.apa.org/ed/precollege/topss/lessons/sensation.pdf">Sensation and Perception</a> <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7" role="doc-endnote">
      <p>Neisser, 1976 <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8" role="doc-endnote">
      <p>Yarbus, 1967 <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5" role="doc-endnote">
      <p>The metaphor of the brain as a predictive machine finds matches, for example, in Jeff Hawkins - On Intelligence (2004), and Karl Friston - The free-energy principle: a unified brain theory? (2010). The latter reduces agents to surprise minimizers; the former adapts the free-energy principle to human cognition, in which automatic processing escalates to higher forms of deliberate decision making (through attention mechanisms) whenever the automatic layer does not know what to do in certain circumstances. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9" role="doc-endnote">
      <p>Mack, Rock 1998 <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10" role="doc-endnote">
      <p>Treisman &amp; Gelade, 1980; Treisman &amp; Sato, 1990 <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Anne Treisman, Garry Gelade; A feature-integration theory of attention; (1980). <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:4:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:11" role="doc-endnote">
      <p>Biederman, 1987 <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12" role="doc-endnote">
      <p>Biederman, 1985 <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13" role="doc-endnote">
      <p>Stroop, John Ridley - Studies of interference in serial verbal reactions (1935) <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:14" role="doc-endnote">
      <p>MacKay, 1973 <a href="#fnref:14" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Giancarlo Frison</name></author><category term="cognitive psychology" /><category term="memory" /><category term="attention" /><summary type="html"><![CDATA[Attention and perception are two fundamental elements for enabling cognitive processes that allow us to be functioning in the environment where we are striving to survive. Likewise any different aspect of functionalities in complex systems, single traits should not seen solely on their own but in conjunction of all other ones that contribute to compounded behaviors. Attention and perception do not make an exception to this rule.]]></summary></entry><entry><title type="html">the neurosymbolic nature of Conceptual Spaces</title><link href="https://gfrison.com/2023/08/04/conceptual-spaces-neuro-symbolic" rel="alternate" type="text/html" title="the neurosymbolic nature of Conceptual Spaces" /><published>2023-08-04T00:00:00+02:00</published><updated>2023-08-04T00:00:00+02:00</updated><id>https://gfrison.com/2023/08/04/conceptual-spaces-neuro-symbolic</id><content type="html" xml:base="https://gfrison.com/2023/08/04/conceptual-spaces-neuro-symbolic"><![CDATA[<blockquote>
  <p>“This is the promise of the Semantic Web – it will improve all the areas of your life where you currently use syllogisms. Which is to say, almost nowhere.”</p>

  <p><em>—Clay Shirky</em></p>
</blockquote>

<blockquote>
  <p>“Fortunately, a large majority of the information we want to express is along the lines of ‘a hex-head bolt is a type of machine bolt.’”</p>

  <p><em>—Berners-Lee</em></p>
</blockquote>

<blockquote>
  <p>“Unfortunately this is not true. If one considers how humans handle concepts, the class relation structures of the Semantic Web capture only a minute part of our information about concepts”</p>

  <p><em>—Peter Gärdenfors</em></p>
</blockquote>

<p>I guess it doesn’t go unnoticed that the semantic web approach isn’t the favourite of the author of Conceptual Spaces (CS).<br />
Two distinguished views on knowledge representation are the semantic web and vectorized embeddings, which belong to the symbolic and the connectionist schools respectively. The CS theory comes from a very different vision of how knowledge should be encoded. Concepts in CS are, without doubt, close to vectorized embedding representations, though they preserve interpretability, a strength of the symbolic world.</p>

<h2 id="vector-embeddings">Vector embeddings</h2>
<p>It’s not hard to think many of you have already heard about word embeddings for knowledge graphs. Modern natural language processing tasks based on neural networks would not exist without vectorized embeddings. We’re witnessing the immense progress of natural language processing (e.g. GPT and related models) in recent times, and their machine learning algorithms rely on word embeddings. A branch of deep learning named graph neural networks (GNN) has brought a similar advancement to machine learning tasks such as link prediction in ontologies. The intuition behind neural network embeddings is that words or graph nodes can be represented as a series of real numbers that embed their semantics, and programs of a very special kind, the neural networks, can use them to accomplish a specific task. Vectorized embeddings come as a byproduct of processes that contemplate no human supervision in the loop. Embeddings, differently from ontologies, don’t convey any semantics that is comprehensible to, let alone editable by, people.</p>
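<p>A minimal sketch of the embedding intuition described above: each word is a point in a vector space, and geometric proximity stands in for semantic similarity. The three-dimensional vectors below are invented for illustration; real embeddings have hundreds of opaque dimensions learned by a neural network.</p>

```python
import math

# Toy word embeddings: each word is a vector of real numbers.
# Values are invented for illustration, not taken from any real model.
emb = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "apple": [0.10, 0.20, 0.90],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for parallel vectors, ~0 for unrelated ones."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Geometric closeness mirrors semantic closeness:
print(cosine(emb["king"], emb["queen"]) > cosine(emb["king"], emb["apple"]))  # True
```

<p>Note that nothing in the vectors themselves explains <em>why</em> ‘king’ is near ‘queen’: that is exactly the interpretability gap the CS framework tries to close.</p>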

<p>If asked during a conference talk, I can hardly imagine someone raising a hand to claim past experience with CS.</p>

<p>In the CS framework, concepts are represented as vectors of numbers that are continuous in their spectrum and convex, which means that similarity scores and distances among concepts are naturally derivable; yet those vectors are not inscrutable, they remain open to human scrutiny. Concepts are built taking into account how we process and categorise them.</p>

<h2 id="properties-as-region-spaces">Properties as region spaces</h2>
<p>Domains represent a single quality. They are convex and differentiable because they can be represented in real values. A concept is qualified via a set of domains. An apple has a round shape, a colour in the range of green, yellow and red, a taste (which is apparently related to the colour), a size, a weight. As with neural embeddings, similar concepts cluster together in regions. In CS, concepts are defined by vectorized properties, each of which describes a quality in a specific domain:</p>
<ul>
  <li>The colour domain could be represented as a three-dimensional space, where each dimension defines the intensity of one of the three basic colours: red, blue and green. The RGB notation is convex in relation to what we perceive as colours. The two ends of the RGB scale are <code class="language-plaintext highlighter-rouge">#000000</code> (black) and <code class="language-plaintext highlighter-rouge">#FFFFFF</code> (white), and this is numerically consistent with our perception, from nothing to all colours together. Further, a slight change in one parameter implies a small visual change, more or less reddish, bluish or greenish depending on the parameter.</li>
  <li>Consider now the price domain. 43,78€ is certainly a number but also a currency, so the price could be encoded in two dimensions.</li>
  <li>Think of the time domain: it is a single-dimension domain, where zero is the present, a positive value is projected into the future, and negative numbers lie in the past. The magnitude describes how far that point is from the present.</li>
</ul>
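<p>The domain structure above can be sketched in a few lines: a concept is a point in each of its quality domains, and distances are computed per domain. All domain encodings and values below are my own invented toy examples, not taken from the CS literature.</p>

```python
import math

# Toy sketch: a concept as a point in each quality domain.
# Domain values are invented for illustration.
apple  = {"colour": (0.9, 0.2, 0.1), "size": (7.0,)}   # reddish, ~7 cm
cherry = {"colour": (0.8, 0.1, 0.1), "size": (2.0,)}   # dark red, small
lime   = {"colour": (0.3, 0.8, 0.2), "size": (5.0,)}   # green, medium

def domain_distance(a, b, domain):
    """Euclidean distance between two concepts within one domain."""
    return math.dist(a[domain], b[domain])

# Within the colour domain, apple is closer to cherry than to lime:
print(domain_distance(apple, cherry, "colour") < domain_distance(apple, lime, "colour"))  # True
```

<p>Unlike a monolithic embedding, each coordinate here keeps a readable meaning (a colour channel, a size in centimetres), which is the interpretability the CS framework is after.</p>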

<blockquote>
  <p><em>“The things have weight, mass, volume, size, time, shape, colour, position, texture, duration, density, smell, value, consistency, depth, boundaries, temperature, function, appearance, price, fate, age, significance.</em></p>

  <p><em>The things have no peace.”</em></p>

  <p><em>—Arnaldo Antunes</em></p>
</blockquote>

<p>The main similarity between neural embeddings and CS is that both are differentiable and convex. At the same time, concepts incorporate symbolism with their multi-faceted domains; they naturally enable <a href="/2023/08/02/argumentation-recommendation-ecommerce-knowledge-graphs">semantic algebra</a> and computational problem solving. For example, the request ‘show me a movie like Casablanca but scary like Shining’ will take the properties of Casablanca but with the <code class="language-plaintext highlighter-rouge">scary</code> domain set to be similar to Shining’s. Put it in the frame of <a href="/tags/#logic-programming">logic programming</a>, and you get it in a line of code.</p>
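<p>The Casablanca/Shining query can be sketched as simple vector surgery: keep the base concept’s point in every domain, then overwrite one domain with the reference concept’s value. The domain names and numbers are invented for illustration; a real system would then search a catalogue for the nearest concept to the resulting point.</p>

```python
# Hedged sketch of "a movie like Casablanca, but scary like Shining".
# All encodings are invented toy values, not from the article.
casablanca = {"romance": 0.9, "scary": 0.10, "pace": 0.4}
shining    = {"romance": 0.1, "scary": 0.95, "pace": 0.5}

# Keep Casablanca's point everywhere, substitute the 'scary' domain:
target = {**casablanca, "scary": shining["scary"]}

# target keeps Casablanca's romance and pace, with Shining's scariness;
# a recommender would return the catalogue movie nearest to this point.
print(target)  # {'romance': 0.9, 'scary': 0.95, 'pace': 0.4}
```

<p>Because each domain is named and convex, this kind of per-domain substitution stays both computable and human-readable, which is the ‘one line of code’ the text alludes to.</p>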

<h2 id="cognitive-affinity-of-cognitive-spaces">Cognitive affinity of conceptual spaces</h2>
<p>CS also has some interesting psychological foundations regarding how people deal with inner knowledge representations and how they learn them. It has proven some validity in explaining cognitive aspects, especially those involved in concept learning and understanding. It has been found that once children have assimilated the meaning of a domain, it is easy to learn concepts that represent a flavoured materialisation of that domain. For example, once they know what the domain of ‘colour’ is, it is easy to learn new colour-related concepts, such as ‘turquoise’. Grasping a new domain is a much harder step than adding new terms to an already established one. Conceptual domains are mental buckets where we place concepts based on how their properties fit into that domain; we don’t have to know how ‘turquoise’ is exactly encoded, we just need to think of it, in comparison to other concepts, as somewhere between light blue and light green. This seems to be a trick we use in learning, justified by the principle of cognitive economy: our mental capabilities are limited and we favour simple and efficient ways to position new information.</p>

<p>I’ve written more about cognitive aspects in <a href="/2022/11/15/meeting-of-minds">meeting of minds</a>. Take a look!</p>