About: In this paper, we present a new program synthesis algorithm based on reinforcement learning. Given an initial policy (i.e. statistical model) trained off-line, our method uses this policy to guide its search and gradually improves it by leveraging feedback obtained from a deductive reasoning engine. Specifically, we formulate program synthesis as a reinforcement learning problem and propose a new variant of the policy gradient algorithm that can incorporate feedback from a deduction engine into the underlying statistical model. The benefit of this approach is two-fold: First, it combines the power of deductive and statistical reasoning in a unified framework. Second, it leverages deduction not only to prune the search space but also to guide search. We have implemented the proposed approach in a tool called Concord and experimentally evaluated it on synthesis tasks studied in prior work. Our comparison against several baselines and two existing synthesis tools shows the advantages of our proposed approach. In particular, Concord solves 15% more benchmarks compared to Neo, a state-of-the-art synthesis tool, while improving synthesis time by 8.71× on benchmarks that can be solved by both tools.
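The idea of feeding deduction results into a policy gradient update can be illustrated with a minimal sketch. It is not Concord's algorithm: the abstract does not give the update rule, so this uses a plain REINFORCE-style gradient, computed exactly by enumerating a tiny program space instead of sampling. Everything named here is a hypothetical stand-in: the three-operator language, the input-output example (3, 8), the sign analysis in `proved_infeasible` that plays the role of the deduction engine, and the reward values.

```python
import math
from itertools import product

# Hypothetical toy language: programs are sequences of two unary integer operators.
OPS = {"inc": lambda v: v + 1, "dbl": lambda v: v * 2, "neg": lambda v: -v}
NAMES = list(OPS)              # ["inc", "dbl", "neg"]
LENGTH = 2
EXAMPLE = (3, 8)               # assumed specification: input 3 must map to output 8

def run(prog, x):
    for op in prog:
        x = OPS[op](x)
    return x

def proved_infeasible(prog):
    """Toy stand-in for a deduction engine: abstract sign analysis.
    Returns True only if the output is provably negative, which
    contradicts the positive expected output."""
    sign = "+"                 # the example input is positive
    for op in prog:
        if op == "neg":
            sign = {"+": "-", "-": "+", "?": "?"}[sign]
        elif op == "inc" and sign == "-":
            sign = "?"         # negative + 1 may be negative, zero, or positive
        # "dbl" preserves the sign; "inc" keeps a positive value positive
    return sign == "-"

def reward(prog):
    if proved_infeasible(prog):
        return -1.0            # penalty derived from deduction, no execution needed
    return 1.0 if run(prog, EXAMPLE[0]) == EXAMPLE[1] else 0.0

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=200, lr=1.0):
    # One row of logits per program position; the policy picks operators independently.
    theta = [[0.0] * len(NAMES) for _ in range(LENGTH)]
    for _ in range(steps):
        probs = [softmax(row) for row in theta]
        grad = [[0.0] * len(NAMES) for _ in range(LENGTH)]
        # Exact expected gradient: sum over all programs of pi(prog) * R(prog) * grad log pi(prog).
        for idxs in product(range(len(NAMES)), repeat=LENGTH):
            prog = [NAMES[i] for i in idxs]
            weight = math.prod(probs[pos][i] for pos, i in enumerate(idxs)) * reward(prog)
            for pos, chosen in enumerate(idxs):
                for k in range(len(NAMES)):
                    indicator = 1.0 if k == chosen else 0.0
                    grad[pos][k] += weight * (indicator - probs[pos][k])
        for pos in range(LENGTH):
            for k in range(len(NAMES)):
                theta[pos][k] += lr * grad[pos][k]
    return theta

def best_program(theta):
    return [NAMES[max(range(len(NAMES)), key=row.__getitem__)] for row in theta]

if __name__ == "__main__":
    print(best_program(train()))
```

In this sketch the deduction penalty lowers the probability of every program that the sign analysis rules out, so the feedback shapes the policy rather than only pruning candidates. A sampling-based variant would replace the enumeration with programs drawn from the current policy.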


An Entity of Type : fabio:Abstract, within Data Space : covidontheweb.inria.fr associated with source document(s)

Attributes	Values
type	abstract
Subject	Algorithms; Mathematical modeling; Reinforcement learning; Reasoning; Markov models; Logic; Scientific modeling; Problem solving skills; Belief revision; Deductive reasoning
part of	Program Synthesis Using Deduction-Guided Reinforcement Learning
is abstract of	Program Synthesis Using Deduction-Guided Reinforcement Learning
