To make throttling by label fully active, the "throttle" option
has to be specified with a specific label.
You can now specify "skip_gate" in the Gerrit comments when you
do a +2 code review to tell Jenkins not to actually run the
gate. You'd do this if you plan to manually merge the change.
Also updated the "printenv" debug output to better sort multi-line
comments.
Change-Id: I4c0b1085acec4805f2ca207eebac50aad81f27e2
Originally, the eligible nodes for a job were labelled only by
"swdev-docker". So basically any node could run any job. We had
found that allowing a node to run more than 1 gate at a time was
problematic so we limited the nodes to processing 1 job at a time.
With the creation of the Asterisk 17 branches however, we now have
so many active branches that getting checks and gates through in
a timely manner is problematic when a node can run only 1 job
at a time.
Now the nodes are also labelled by the job type they can run.
For instance: "asterisk-check", "asterisk-gate", etc. With the
"Throttle Concurrent Builds" plugin, we can now allow a node to
run more than 1 job BUT throttle by job type. For instance:
Allow 2 jobs but only 1 asterisk-gate at a time.
Now a node can run 2 checks or 1 check and 1 gate or 1 gate but
not 2 gates at a time.
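The admission rule described above can be sketched in C. This is only an illustration of the logic, not the "Throttle Concurrent Builds" plugin's implementation; the struct, the function name and the two limits are hypothetical and simply mirror the "2 jobs, 1 gate" example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical limits mirroring the example in the text:
 * at most 2 jobs per node, at most 1 "asterisk-gate" job per node. */
#define MAX_JOBS_PER_NODE 2
#define MAX_GATES_PER_NODE 1

struct node_load {
	int total_jobs; /* jobs of any type currently running on the node */
	int gate_jobs;  /* how many of those are "asterisk-gate" jobs */
};

/* Returns true if the node may start one more job of the given type. */
static bool node_can_accept(const struct node_load *load, const char *job_type)
{
	assert(load != NULL && job_type != NULL);

	if (load->total_jobs >= MAX_JOBS_PER_NODE) {
		return false;
	}
	if (strcmp(job_type, "asterisk-gate") == 0
		&& load->gate_jobs >= MAX_GATES_PER_NODE) {
		return false;
	}
	return true;
}
```

A node already running 1 gate accepts a check but rejects a second gate; a node running 2 jobs of any type rejects everything.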
Change-Id: I2032bf6afbcec5c341d9b852214c0c812d3d6db5
We don't support non-core modules for Certified releases but we
were enabling them for CI builds which was causing lots of test
failures. Now we don't.
Change-Id: I0b3254c08a2479f3d39151690350cce5ce5ad766
We're at the point where there are enough Jenkins jobs for
Asterisk branches that even cleaned checkouts of Asterisk
will add up to more disk space than is available on the
in-memory workspace mount. Since we archive all relevant
artifacts anyway, there's no need to keep the workspace
around after the job finishes, whether it succeeds or fails.
Change-Id: I1cd3b73ebb045a987df0f62526d152a510210c39
** Note **
This patch is meant to be the minimum needed for the MWI core to use
the stasis_state module that now underlies it. As such, it does not completely
remove the MWI core's reliance on the stasis_cache. Keeping that reliance means
current consumers, and their code paths, do not have to change for this patch.
When time allows, subsequent patches can/will be made to those consumers to take
advantage of some of the new MWI API included here, eventually removing the MWI
dependency on the stasis_cache altogether.
** End Note **
This patch makes the MWI core take advantage of the new stasis_state
API. Consumers of MWI should no longer need to depend directly upon stasis topic
pooling and the stasis cache. Similar functionality and implementation details
have now been pushed into the stasis_state module. However, all MWI state should
be accessed via the MWI API itself.
As such, a few new methods and constructs have been added to the MWI core that
facilitate consumer publishing, subscribing, and iterating over MWI state data.
* ast_mwi_subscriber *
Created via ast_mwi_add_subscriber, a subscriber subscribes to a given mailbox
in order to receive updates about the given mailbox. Adding a subscriber will
create the underlying topic and associated state data if those do not already
exist for it. The topic and last known state data are guaranteed to exist for
the lifetime of the subscriber.
* ast_mwi_publisher *
Before publishing to a particular topic a publisher should be created. This can
be achieved by using ast_mwi_add_publisher. Publishing to a mailbox should then
be done using one of the MWI publish functions. This ensures the message is
published to the appropriate topic, and the last known state is maintained.
* ast_mwi_observer *
Add an observer in order to watch for particular MWI module related events. For
instance, if a submodule needs to know when a subscription is added to any
mailbox, an observer can be added to watch for that.
* other *
Urgent message count is now part of the published MWI state object. Also state
can be iterated over using defined callbacks.
ASTERISK-28442
Change-Id: I93f935f9090cd5ddff6d4bc80ff90703c05cf776
This new module describes an API that can be thought of as a combination of
stasis topic pools and caching, but hopefully done in a more efficient
and less memory "leaky" manner.
The API defines methods and data structures for managing and tracking
published message state through stasis. By adding a subscriber or publisher,
consumers can more easily track the lifetime of the contained state. For
instance, when no more publishers and/or subscribers have need of the topic
and its associated state, that data is removed from the managed container.
* stasis_state_manager *
The manager stores and, well, manages state data. Each state is an association
of a unique stasis topic and the last known published stasis message on that
topic. There is only ever one managed state object per topic. All messages on
each topic are also forwarded to an "all" topic maintained by the manager.
* stasis_state_subscriber *
Topic and state can be created or referenced within the manager by adding a
stasis_state_subscriber. When adding a subscriber, if no state currently exists,
new managed state is immediately created. If managed state already exists, then
a new subscriber is created referencing that state. The managed state is
guaranteed to live throughout the subscriber's lifetime. State is only removed
from the manager when no other entities require it.
* stasis_state_publisher *
Topic and state can also be created or referenced within the manager by adding
a stasis_state_publisher. When adding a publisher, if no state currently exists,
new managed state is created. If managed state already exists, then a new
publisher is created referencing that state. The managed state is guaranteed to
live throughout the publisher's lifetime. State is only removed from the
manager when no other entities require it.
* stasis_state_observer *
Some modules may wish to watch for and react to managed state events. By
registering a state observer and implementing handlers for the desired
callbacks, those modules can do so.
* other *
Callbacks also exist that allow consumers to iterate over all or some of the
managed state.
ASTERISK-28442
Change-Id: I7a4a06685a96e511da9f5bd23f9601642d7bd8e5
We were using the presence of /usr/lib64 to determine where
shared libraries should be installed: if it existed, use it,
otherwise use /usr/lib. The directory only existed on Red Hat
based systems, so this was safe.
Unfortunately, Ubuntu 19 decided to create a /usr/lib64 BUT
NOT INCLUDE IT IN THE DEFAULT ld.so.conf. So if anything is
installed there, it won't be found by the dynamic linker.
The new method just looks for $ID in /etc/os-release: if it's
centos or fedora, it uses /usr/lib64, and if it's ubuntu, it uses /usr/lib.
NOTE: This applies only to the CI scripts. Normal asterisk
build and install is not affected.
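The decision described above can be sketched as follows. The CI scripts themselves are shell; this C version only illustrates the lookup. The function names are hypothetical, the parser takes the file contents as a string so it can be exercised without touching /etc/os-release, and falling back to /usr/lib for any ID other than centos or fedora is an assumption, since the text only names those two and ubuntu.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of the ID= line from os-release style text into "id",
 * stripping optional surrounding double quotes.
 * Returns 0 on success, -1 if no ID= line is present.
 */
static int os_release_id(const char *text, char *id, size_t id_len)
{
	const char *line = text;

	assert(text != NULL && id != NULL && id_len > 0);

	while (line && *line) {
		/* Only match at the start of a line, so ID_LIKE= and
		 * VERSION_ID= are not mistaken for ID=. */
		if (strncmp(line, "ID=", 3) == 0) {
			const char *val = line + 3;
			size_t len = strcspn(val, "\n");

			if (len >= 2 && val[0] == '"' && val[len - 1] == '"') {
				val++;
				len -= 2;
			}
			snprintf(id, id_len, "%.*s", (int) len, val);
			return 0;
		}
		line = strchr(line, '\n');
		if (line) {
			line++;
		}
	}
	return -1;
}

/* Map a distribution ID to the shared library directory. */
static const char *libdir_for_id(const char *id)
{
	if (strcmp(id, "centos") == 0 || strcmp(id, "fedora") == 0) {
		return "/usr/lib64";
	}
	/* Assumption: everything else, including ubuntu, uses /usr/lib. */
	return "/usr/lib";
}
```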
Change-Id: Iad66374b550fd89349bedbbf2b93f8edd195a7c3
This patch adds basic Asterisk channel statistics to the res_prometheus
module. This includes:
* asterisk_calls_sum: A running sum of the total number of
processed calls
* asterisk_calls_count: The current number of calls
* asterisk_channels_count: The current number of channels
* asterisk_channels_state: The state of any particular channel
* asterisk_channels_duration_seconds: How long a channel has existed,
in seconds
In all cases, enough information is provided with each channel metric
to determine a unique instance of Asterisk that provided the data, as
well as the name, type, unique ID, and - if present - linked ID of each
channel.
ASTERISK-28403
Change-Id: I0db306ec94205d4f58d1e7fbabfe04b185869f59
Prometheus is the de facto monitoring tool for containerized applications.
This patch adds native support to Asterisk for serving up Prometheus
compatible metrics, such that a Prometheus server can scrape an Asterisk
instance in the same fashion as it does other HTTP services.
The core module in this patch provides an API that future work can build
on top of. The API manages metrics in one of two ways:
(1) Registered metrics. In this particular case, the API assumes that
the metric (either allocated on the stack or on the heap) will have
its value updated by the module registering it at will, and not
just when Prometheus scrapes Asterisk. When a scrape does occur,
the metrics are locked so that the current value can be retrieved.
(2) Scrape callbacks. In this case, the API allows consumers to be
called via a callback function when a Prometheus initiated scrape
occurs. The consumers of the API are responsible for populating
the response to Prometheus themselves, typically using stack
allocated metrics that are then formatted properly into strings
via this module's convenience functions.
These two mechanisms balance the different ways in which information is
generated within Asterisk: some information is generated in a fashion
that makes it appropriate to update the relevant metrics immediately;
some information is better to defer until a Prometheus server asks for
it.
Note that some care has been taken in how metrics are defined to
minimize the impact on performance. Prometheus's metric definition
and its support for nesting metrics based on labels - which are
effectively key/value pairs - can make storage and managing of metrics
somewhat tricky. While a naive approach, where we allow for any number
of labels and perform a lot of heap allocations to manage the information,
would absolutely have worked, this patch instead opts to try to place
as much information in length limited arrays, stack allocations, and
vectors to minimize the performance impacts of scrapes. The author of
this patch has worked on enough systems that were driven to their knees
by poor monitoring implementations to be a bit cautious.
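To illustrate the "length limited arrays, stack allocations" idea, here is a minimal sketch of a metric that lives entirely in fixed-size arrays and is rendered into the Prometheus text exposition format. It is not the module's actual API: the struct layout, the limits, the function name and the "eid" label used in the usage note are all hypothetical, and label values are assumed not to need escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limits; the real module's limits are not given in the text. */
#define MAX_LABELS 4
#define MAX_NAME_LEN 64
#define MAX_VALUE_LEN 32

struct label {
	char name[MAX_NAME_LEN];
	char value[MAX_NAME_LEN];
};

struct metric {
	const char *type;                /* "gauge" or "counter" */
	char name[MAX_NAME_LEN];
	char value[MAX_VALUE_LEN];
	struct label labels[MAX_LABELS]; /* unused slots have an empty name */
};

/*
 * Render one metric as Prometheus text into "buf".
 * Returns the number of bytes written, or -1 if it did not fit.
 * No heap allocation is performed.
 */
static int metric_to_string(const struct metric *m, char *buf, size_t len)
{
	size_t used;
	int n, i, first = 1;

	assert(m != NULL && buf != NULL);

	n = snprintf(buf, len, "# TYPE %s %s\n%s", m->name, m->type, m->name);
	if (n < 0 || (size_t) n >= len) {
		return -1;
	}
	used = (size_t) n;

	for (i = 0; i < MAX_LABELS; i++) {
		if (m->labels[i].name[0] == '\0') {
			continue;
		}
		n = snprintf(buf + used, len - used, "%s%s=\"%s\"",
			first ? "{" : ",", m->labels[i].name, m->labels[i].value);
		if (n < 0 || (size_t) n >= len - used) {
			return -1;
		}
		used += (size_t) n;
		first = 0;
	}

	n = snprintf(buf + used, len - used, "%s %s\n", first ? "" : "}", m->value);
	if (n < 0 || (size_t) n >= len - used) {
		return -1;
	}
	used += (size_t) n;

	return (int) used;
}
```

A scrape callback in this style could declare a struct metric on its stack, fill in a name such as asterisk_channels_count with one label, and append the rendered text to the response; treating that metric as a gauge is an assumption made for the example.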
Additionally, this patch only adds support for gauges and counters.
Additional work to add summaries, histograms, and other Prometheus
metric types may add value in the future. This would be of particular
interest if someone wanted to track SIP response types.
Finally, this patch includes unit tests for the core APIs.
ASTERISK-28403
Change-Id: I891433a272c92fd11c705a2c36d65479a415ec42
Added a conversion for umax (uintmax_t, the widest unsigned integer type). Adjusted
the other current conversion functions (uint and ulong) to be derivatives of
the umax conversion since their ranges are simply subsets of umax.
Also made the negative check skip leading whitespace first, since strtoumax does
so anyway.
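A minimal sketch of this layering, assuming a shared helper that takes an upper limit. The function names here are hypothetical and the real conversion functions may differ in name, signature and base handling; base 10 is used only to keep the example small.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a non-negative decimal integer no larger than "max".
 * Returns 0 on success, -1 on a negative value, garbage, or overflow.
 */
static int str_to_umax_limited(const char *str, uintmax_t *res, uintmax_t max)
{
	char *end;
	uintmax_t val;

	if (!str || !res) {
		return -1;
	}

	/* strtoumax() skips leading whitespace itself, so skip it here too
	 * before rejecting a minus sign (strtoumax() would otherwise accept
	 * the sign and return a negated, wrapped value). */
	while (isspace((unsigned char) *str)) {
		str++;
	}
	if (*str == '-') {
		return -1;
	}

	errno = 0;
	val = strtoumax(str, &end, 10);
	if (end == str || *end != '\0' || errno == ERANGE || val > max) {
		return -1;
	}

	*res = val;
	return 0;
}

static int str_to_umax(const char *str, uintmax_t *res)
{
	return str_to_umax_limited(str, res, UINTMAX_MAX);
}

/* ulong and uint are the same conversion with a smaller limit. */
static int str_to_ulong(const char *str, unsigned long *res)
{
	uintmax_t val;

	assert(res != NULL);
	if (str_to_umax_limited(str, &val, ULONG_MAX)) {
		return -1;
	}
	*res = (unsigned long) val;
	return 0;
}

static int str_to_uint(const char *str, unsigned int *res)
{
	uintmax_t val;

	assert(res != NULL);
	if (str_to_umax_limited(str, &val, UINT_MAX)) {
		return -1;
	}
	*res = (unsigned int) val;
	return 0;
}
```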
Change-Id: I56c2ef2629d49b524c8df58af12951c181f81f08
One of the downsides of having things like test configuration
in the git repo is that it can't be changed at runtime. You have
to create a review for the changes and merge it before it will
take effect.
This review moves the data currently held in
tests/CI/periodic-dailyTestGroups.json and
tests/CI/gateTestGroups.json into a Jenkins Config File attached
to the job definitions. This allows us to alter it from the
Jenkins UI at runtime. The original files stay in the repo
as documentation.
Change-Id: I14b9702f6285ce1fb2420287ba0e7d3b59109763
The new option disables dev mode, TEST_FRAMEWORK and
MALLOC_DEBUG making the build more production-like.
Change-Id: Ieb72497d4d91d5416684aaed702cc3f532099738
It was difficult to check a channel's current application and
its parameters using ARI. Added app_name and app_data
items to show the current application information.
ASTERISK-28343
Change-Id: Ia48972b3850e5099deab0faeaaf51223a1f2f38c
Added the ability to specify that a wizard is read-only when applying
it to a specific object type. This allows you to specify
create, update and delete callbacks for the wizard but limit
which object types can use them.
Added the ability to allow an object type to have multiple
wizards of the same type. This is indicated when a wizard
is added to a specific object type.
Added 3 new sorcery wizard functions:
* ast_sorcery_object_type_insert_wizard which does the same thing
as the existing ast_sorcery_insert_wizard_mapping function but
accepts the new read-only and allow-duplicates flags and also
returns the ast_sorcery_wizard structure used and its internal
data structure. This allows immediate use of the wizard's
callbacks without having to register a "wizard mapped" observer.
* ast_sorcery_object_type_apply_wizard which does the same
thing as the existing ast_sorcery_apply_wizard_mapping function
but has the added capabilities of
ast_sorcery_object_type_insert_wizard.
* ast_sorcery_object_type_remove_wizard which removes a wizard
matching both its name and its original argument string.
* The original logic in __ast_sorcery_insert_wizard_mapping was moved
to __ast_sorcery_object_type_insert_wizard and enhanced for the
new capabilities, then __ast_sorcery_insert_wizard_mapping was
refactored to just call __ast_sorcery_object_type_insert_wizard.
* Added a unit test to test_sorcery.c to test the read-only
capability.
Change-Id: I40f35840252e4313d99e11dbd80e270a3aa10605
Made the timestamp a required field for all ARI events.
The following ARI events were changed to include a timestamp:
PlaybackStarted, PlaybackContinuing, PlaybackFinished,
RecordingStarted, RecordingFinished, RecordingFailed,
ApplicationReplaced, ApplicationMoveFailed
ASTERISK-28326
Change-Id: I382c2fef58f5fe107e1074869a6d05310accb41f
The recent upgrade of Gerrit to 2.16 eliminated a way of
referencing a repository that the jenkinsfiles were relying on, so
the URL references were changed to a more consistent and supported
format.
Change-Id: I2e8e3f213b9a96bb1b27665eca4a9a24bc49820e
(cherry picked from commit 5ce084579f)
To prevent one subsystem's taskprocessors from causing others
to stall, new capabilities have been added to taskprocessors.
* Any taskprocessor name that has a '/' will have the part
before the '/' saved as its "subsystem".
Examples:
"sorcery/acl-0000006a" and "sorcery/aor-00000019"
will be grouped to subsystem "sorcery".
"pjsip/distributor-00000025" and "pjsip/distributor-00000026"
will be grouped to subsystem "pjsip".
Taskprocessors with no '/' have an empty subsystem.
* When a taskprocessor enters high-water alert status and it
has a non-empty subsystem, the subsystem alert count will
be incremented.
* When a taskprocessor leaves high-water alert status and it
has a non-empty subsystem, the subsystem alert count will be
decremented.
* A new API ast_taskprocessor_get_subsystem_alert() has been
added that returns the number of taskprocessors in alert for
the subsystem.
* A new CLI command "core show taskprocessor alerted subsystems"
has been added.
* A new unit test was added.
REMINDER: The taskprocessor code itself doesn't take any action
based on high-water alerts or overloading. It's up to taskprocessor
users to check and take action themselves. Currently only the pjsip
distributor does this.
* A new pjsip/global option "taskprocessor_overload_trigger"
has been added that allows the user to select the trigger
mechanism the distributor uses to pause accepting new requests.
"none": Don't pause on any overload condition.
"global": Pause on ANY taskprocessor overload (the default and
current behavior)
"pjsip_only": Pause only on pjsip taskprocessor overloads.
* The core pjsip pool was renamed from "SIP" to "pjsip" so it can
be properly grouped into the "pjsip" subsystem.
* stasis taskprocessor names were changed to "stasis" as the
subsystem.
* Sorcery core taskprocessor names were changed to "sorcery" to
match the object taskprocessors.
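The subsystem grouping and alert counting described above can be sketched as follows. This is an illustration only, not the taskprocessor code: the function names are hypothetical, the split is assumed to happen at the first '/', and the counter tracks a single subsystem rather than a container of them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the part of "name" before the first '/' into "subsystem".
 * Names without a '/' yield an empty subsystem.
 */
static void tps_subsystem(const char *name, char *subsystem, size_t len)
{
	const char *slash;

	assert(name != NULL && subsystem != NULL && len > 0);

	slash = strchr(name, '/');
	if (!slash) {
		subsystem[0] = '\0';
		return;
	}
	snprintf(subsystem, len, "%.*s", (int) (slash - name), name);
}

/*
 * Return the updated alert count for the "tracked" subsystem after the
 * taskprocessor "tps_name" enters (entering != 0) or leaves (entering == 0)
 * high-water alert status. Taskprocessors with an empty or different
 * subsystem leave the count unchanged.
 */
static unsigned int update_alert(const char *tracked, unsigned int count,
	const char *tps_name, int entering)
{
	char subsystem[64];

	tps_subsystem(tps_name, subsystem, sizeof(subsystem));
	if (subsystem[0] == '\0' || strcmp(subsystem, tracked) != 0) {
		return count;
	}
	if (entering) {
		return count + 1;
	}
	return count > 0 ? count - 1 : 0;
}
```

A user following the "pjsip_only" trigger would, in this sketch, pause only while the count for "pjsip" is non-zero.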
Change-Id: I8c19068bb2fc26610a9f0b8624bdf577a04fcd56
Some tests require Asterisk to execute scripts which
are stored in /tmp. When mount is used for tmpfs there
is no ability to allow scripts to be executed from
that location.
This change switches to using tmpfs which can be told
to allow executables to be run from /tmp.
Change-Id: I0e598ca2b76af1f7f2d29f0da7b1731a214a291a
This change makes it so that even if non-code changes
occur (such as commit message changing) unit tests
will still be run and result in a verification.
ASTERISK-28251
Change-Id: I6491fff7c93e5d5cd8e41054486968bf66c4f608
A subscriber can now indicate that it only wants messages
that have formatters of a specific type. For instance,
manager can indicate that it only wants messages that have a
"to_ami" formatter. You can combine this with the existing
filter for message type to get only messages with specific
formatters or messages of specific types.
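One way to picture the combined filter is a bitmask of wanted formatters plus a list of accepted message types, where a match on either is enough. This is a sketch under assumptions: the flag names, structs and function are hypothetical and are not the stasis API, and a subscription with no filters at all is assumed to accept everything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical formatter flags; the real names are not given in the text. */
enum formatter_flag {
	FORMATTER_JSON = 1 << 0,
	FORMATTER_AMI = 1 << 1,
	FORMATTER_EVENT = 1 << 2,
};

struct message {
	int type_id;             /* stands in for the message type */
	unsigned int formatters; /* formatters this message provides */
};

struct subscription {
	unsigned int wanted_formatters; /* 0 means no formatter filter */
	const int *accepted_types;      /* may be NULL when num_types is 0 */
	size_t num_types;
};

/* Returns true if the message should be delivered to the subscription. */
static bool subscription_wants(const struct subscription *sub,
	const struct message *msg)
{
	size_t i;

	assert(sub != NULL && msg != NULL);

	if (!sub->wanted_formatters && !sub->num_types) {
		return true; /* no filters configured */
	}
	if (sub->wanted_formatters & msg->formatters) {
		return true; /* has at least one wanted formatter */
	}
	for (i = 0; i < sub->num_types; i++) {
		if (sub->accepted_types[i] == msg->type_id) {
			return true; /* is one of the accepted types */
		}
	}
	return false;
}
```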
ASTERISK-28186
Change-Id: Ifdb7a222a73b6b56c6bb9e4ee93dc8a394a5494c
* Added --no-configure, --no-menuselect, --no-make and --no-alembic
options that prevent those actions from being performed. Useful
for testing and re-running portions of the build after fixing
earlier failures.
* Added "set -e" to abort the script on command failure.
Not sure why this wasn't there in the first place.
* Fixed a few echos that were redirecting to stderr when they shouldn't
have been.
* Catch more alembic failures by actually trying to generate the SQL.
Change-Id: I9f395fa4e9254be7299e7c1014f1a13db78faffb
This test was occasionally failing, with:
WARNING[5812]: http.c:1939 httpd_helper_thread: Failed to set
TCP_NODELAY on HTTP connection: Bad file descriptor
ERROR[5812]: iostream.c:91 ast_iostream_nonblock: Failed to get
fcntl() flags for file descriptor: Bad file descriptor
ERROR[5812]: iostream.c:569 ast_iostream_close: close() failed: Bad
file descriptor
Disabled for now by making the test explicit only.
Change-Id: I778f6cbb6104c6b4e89737a2eaf1a9540888d351
These are only a few of the leaks. With the large number of macros
and return paths in this file, plugging them all would take a week's
worth of work.
Change-Id: Ie2369fa944023d44767871c5c30974cb077ffb56
* The bridging core no longer uses the stasis cache for bridge
snapshots. The latest bridge snapshot is now stored on the
ast_bridge structure itself.
* The following APIs are no longer available since the stasis cache
is no longer used:
ast_bridge_topic_cached()
ast_bridge_topic_all_cached()
* A topic pool is now used for individual bridge topics.
* The ast_bridge_cache() function was removed since there's no
longer a separate container of snapshots.
* A new function "ast_bridges()" was created to retrieve the
container of all bridges. Users formerly calling
ast_bridge_cache() can use the new function to iterate over
bridges and retrieve the latest snapshot directly from the
bridge.
* The ast_bridge_snapshot_get_latest() function was renamed to
ast_bridge_get_snapshot_by_uniqueid().
* A new function "ast_bridge_get_snapshot()" was created to retrieve
the bridge snapshot directly from the bridge structure.
* The ast_bridge_topic_all() function now returns a normal topic
not a cached one so you can't use stasis cache functions on it
either.
* The ast_bridge_snapshot_type() stasis message now has the
ast_bridge_snapshot_update structure as its data. It contains
the last snapshot and the new one.
* cdr, cel, manager and ari have been updated to use the new
arrangement.
Change-Id: I7049b80efa88676ce5c4666f818fa18ad1985369
When a channel snapshot was created it used to be done
from scratch, copying all data (many strings). This incurs
a cost when doing so.
This change segments the channel snapshot into different
components which can be reused if unchanged from the
previous snapshot creation, reducing the cost. In normal
cases this results in some pointers being copied with
reference count being bumped, some integers being set,
and a string or two copied. The other benefit is that it
is now possible to determine if a channel snapshot update
is redundant and thus stop it before a message is published
to stasis.
The specific segments in the channel snapshot were split up
based on whether they are changed together, how often they
are changed, and their general grouping. In practice only
1 (or 0) of the segments actually get changed in normal
operation.
Invalidation is done by setting a flag on the channel when
the segment source is changed, forcing creation of a new
segment when the channel snapshot is created.
ASTERISK-28119
Change-Id: I5d7ef3df963a88ac47bc187d73c5225c315f8423
Channels no longer use the Stasis cache for channel snapshots. Instead
they are stored in a hash table in stasis_channels which reduces the
number of Stasis messages created and allows better storage.
As a result the following APIs are no longer available since the stasis
cache is no longer used:
ast_channel_topic_cached()
ast_channel_topic_all_cached()
The ast_channel_cache_all() and ast_channel_cache_by_name() functions
now return an ao2_container of ast_channel_snapshots rather than
a container of stasis_messages therefore you can't (and don't need
to) call stasis_cache functions on it.
The ast_channel_topic_all() function now returns a normal topic not
a cached one so you can't use stasis cache functions on it either.
The ast_channel_snapshot_type() stasis message now has the
ast_channel_snapshot_update structure as its data. It contains the
last snapshot and the new one.
ast_channel_snapshot_get_latest() still returns the latest snapshot.
The latest snapshot is now stored on the channel itself to eliminate
cache hits when Stasis messages that have the snapshot as a payload
are created.
ASTERISK-28102
Change-Id: I9334febff60a82d7c39703e49059fa3a68825786
Replace usage of ao2_container_alloc with ao2_container_alloc_hash or
ao2_container_alloc_list. Remove ao2_container_alloc macro.
Change-Id: I0907d78bc66efc775672df37c8faad00f2f6c088
Create ao2_container_dup_weakproxy_objs to perform a similar function to
ao2_container_dup. This function expects the source container to have
weakproxy objects, inserts the associated non-weak objects into the
destination container. Orphaned weakproxy objects are ignored.
Create test for this new function and for ao2_weakproxy_find.
Change-Id: I898387f058057e08696fe9070f8cd94ef3a27482
The job timeouts were hard coded in the jenkinsfiles, which
meant changes had to go through Gerrit. Now they are taken
from the following environment variables (shown with their defaults)
that can be set in Jenkins configuration...
TIMEOUT_GATES = "60 MINUTES"
TIMEOUT_DAILIES = "3 HOURS"
TIMEOUT_REF_DEBUG = "24 HOURS"
TIMEOUT_UNITTESTS = "30 MINUTES"
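The "environment variable with a default" pattern can be sketched as below. The jenkinsfiles are Groovy, so this C version only illustrates the lookup; both function names are hypothetical, and parsing the "<count> <UNIT>" form into seconds is an assumption about how such a value could be interpreted, limited to the two units that appear in the defaults.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of environment variable "var", or "fallback" if it
 * is unset or empty. */
static const char *timeout_setting(const char *var, const char *fallback)
{
	const char *val;

	assert(var != NULL && fallback != NULL);

	val = getenv(var);
	return (val && *val) ? val : fallback;
}

/* Convert a "<count> <UNIT>" string such as "60 MINUTES" to seconds.
 * Returns -1 if the string is not in that form or the unit is unknown. */
static long timeout_seconds(const char *spec)
{
	long count;
	char unit[16];

	assert(spec != NULL);

	if (sscanf(spec, "%ld %15s", &count, unit) != 2 || count < 0) {
		return -1;
	}
	if (strcmp(unit, "MINUTES") == 0) {
		return count * 60;
	}
	if (strcmp(unit, "HOURS") == 0) {
		return count * 3600;
	}
	return -1;
}
```

For example, timeout_seconds(timeout_setting("TIMEOUT_GATES", "60 MINUTES")) yields the gate timeout in seconds, using the default when the variable is not set.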
Change-Id: I673a551c1780bf665a3bc160b245da574aa4bbab
We've been seeing crashes in libbfd when we attempt to generate
a stack trace from multiple threads. It turns out that libbfd
is NOT thread-safe. It can cache the bfd structure and give it to
multiple threads without protecting itself. To get around this,
we've added a global mutex around the bfd functions and also have
refactored the use of those functions to be more efficient and
to provide more information about inlined functions.
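The locking approach can be sketched with a single process-wide mutex wrapped around every call into the non-thread-safe library. In this sketch unsafe_lookup() is a hypothetical stand-in for the libbfd calls, and all other names are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One global lock shared by every caller of the unsafe library. */
static pthread_mutex_t resolver_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared state that the unsafe function mutates without protection. */
static int resolver_calls;

/* Hypothetical stand-in for a library call that is not thread-safe. */
static int unsafe_lookup(const void *address)
{
	(void) address;
	return ++resolver_calls;
}

/* All callers go through this wrapper, so only one thread at a time can
 * be inside the unsafe library. */
static int locked_lookup(const void *address)
{
	int res;

	assert(address != NULL);

	pthread_mutex_lock(&resolver_lock);
	res = unsafe_lookup(address);
	pthread_mutex_unlock(&resolver_lock);

	return res;
}
```

The cost of this design is that symbol resolution is serialized across threads, which is usually acceptable for a diagnostic path such as backtrace generation.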
Also added a few more tests to test_pbx.c. One just calls
ast_assert() and the other calls ast_log_backtrace(). Neither is
run by default.
WARNING: This change necessitated changing the return value of
ast_bt_get_symbols() from an array of strings to a VECTOR of
strings. However, the use of this function outside Asterisk is not
likely.
ASTERISK-28140
Change-Id: I79d02862ddaa2423a0809caa4b3b85c128131621
The testsuite can now use a user-specified work directory for
all its temp files. This allows the docker containers to use
a tmpfs backed directory for the temp files instead of their
own write-layer image.
* runTestsuite.sh now accepts a --work-dir command line argument
that gets exported as AST_WORK_DIR before running the testsuite.
* gates.jenkinsfile now specifies --work-dir to be
<testsuite_dir>/astroot.
Since the Asterisk CI docker hosts now mount /srv/jenkins/workspace
on a tmpfs, asterisk should be compiled and the testsuite run all in
memory.
Change-Id: If5ee905a15821296c355bb84cda38950ad8edc45
(cherry picked from commit a335f4c9ad)
There seems to be a race condition between starting the asterisk
daemon and attempting to use 'asterisk -r' that can cause the
control socket file to not be created. Since all of the Jenkins
slaves have 'expect' installed, the runUnittests script can use
it to start asterisk in the foreground and issue the commands
interactively. This is much more reliable and it can also make
startup errors more visible since they'll be in the Jenkins console
output.
If 'expect' isn't installed, the original daemon/asterisk -r
process is used.
Also added a "core show settings" before running the tests
and added "notice,warning,error" to the console log.
Change-Id: Idd656085f854afede813ac241b9e312b31358160
It's possible for a 4th task to be spawned before we cancel. This
results in a write to the already freed test_data1. Wait long enough to
verify success of the cancelation before freeing test_data1.
Change-Id: I057e2fcbe97f8a175e50890be89c28c20490a20f
These macros have been documented as legacy for a long time but are
still used in new code because they exist. Remove all references to:
* ao2_container_alloc_options
* ao2_t_container_alloc_options
* ao2_t_container_alloc
These macros are also removed. Only ao2_container_alloc remains due to
its use in over 100 places.
Change-Id: I1a26258b5bf3deb081aaeed11a0baa175c933c7a
Add attribute_warn_unused_result to ast_taskprocessor_push,
ast_taskprocessor_push_local and ast_threadpool_push. This will help
ensure we perform the necessary cleanup upon failure.
Change-Id: I7e4079bd7b21cfe52fb431ea79e41314520c3f6d