Commit Graph

5653 Commits

Author SHA1 Message Date
Joshua Colp
4efc6b4315 Merge changes from topic 'system_stress_patches' into 13
* changes:
  Bridge system: Fix memory leaks and double frees on impart failure.
  bridge_softmix.c: Fix crash if channel fails to join mixing tech.
2016-04-26 04:56:36 -05:00
George Joseph
b38f1146e5 config: Fix ast_config_text_file_save2 writability check for missing files
A patch I did back in 2014 modified ast_config_text_file_save2 to check the
writability of the main file and include files before truncating and re-writing
them.  An unintended side-effect of this was that if a file doesn't exist,
the check fails and the write is aborted.

This patch causes ast_config_text_file_save2 to check the writability of the
parent directory of missing files instead of checking the file itself.  This
allows missing files to be created again.  A unit test was also added to
test_config to test saving of config files.

The regression was discovered when app_voicemail's passwordlocation=spooldir
feature stopped working.

ASTERISK-25917 #close
Reported-by: Jonathan Rose

Change-Id: Ic4dbe58c277a47b674679e49daed5fc6de349f80
2016-04-25 18:16:58 -05:00
zuul
09f8f8daa1 Merge "bridge: Hold off more than one imparting channel at a time." into 13 2016-04-22 18:29:19 -05:00
Richard Mudgett
ba63aa7c9e Manager: Short circuit AMI message processing.
Improve AMI message processing performance if there are no consumers
listening for the messages.  We now skip creating the AMI event message
text strings.

Change-Id: I7b22fc5ec4e500d00635c1a467aa8ea68a1bb2b3
2016-04-22 16:44:05 -05:00
Richard Mudgett
d5ee6acf28 manager.c: Eliminate most RAII_VAR usage.
* Made ast_manager_event_blob_create() not allocate the ao2 event object
with a lock as it is not needed.

Change-Id: I8e11bfedd22c21316012e0b9dd79f5918f644b7c
2016-04-22 16:44:05 -05:00
Richard Mudgett
7303e3dc96 manager_channels.c: Fix allocation failure crash.
An earlier allocation failure failed to create a channel snapshot for the
AMI HangupRequest/SoftHangupRequest event which resulted in a crash in
channel_hangup_request_cb().  Where the stasis message gets generated
cannot tell if the NULL snapshot returned was because of an allocation
failure or the channel was a dummy channel.

* Made channel_hangup_request_cb() check if the channel blob has a
snapshot and exit if it doesn't.

* Eliminated the RAII_VAR usage in channel_hangup_request_cb().

Change-Id: I0b6a1c4e95cbb7d80b2a7054c6eadecc169dfd24
2016-04-22 16:44:05 -05:00
Richard Mudgett
1e93f3d723 Bridge system: Fix memory leaks and double frees on impart failure.
You cannot reference the passed in features struct after calling
ast_bridge_impart().  Even if the call fails.

Change-Id: I902b88ba0d5d39520e670fb635078a367268ea21
2016-04-22 16:44:04 -05:00
Richard Mudgett
5e388d4188 bridge_softmix.c: Fix crash if channel fails to join mixing tech.
softmix_bridge_join() failed because of an allocation failure.  To address
this, the softmix bridge technology now checks if the channel failed to
join softmix successfully.  In addition, the bridge now begins the process
of kicking the channel out of the bridge so we don't have channels
partially in the bridge for very long.

* Fix the test_channel_feature_hooks.c unit tests.  The test channel must
have a valid codec to join the simple_bridge technology.  This patch makes
joining a bridge more strict by not allowing partially joined channels to
remain in the bridge.

Change-Id: I97e2ade6a2bcd1214f24fb839fda948825b61a2b
2016-04-22 16:44:04 -05:00
Diederik de Groot
e750ea9b5b lock.c: Check *lt before dereferencing it
*lt is NULL if t->tracking == 0

ASTERISK-25948 #close

Change-Id: I4a81af28f9c82a74aa82413d772a7dc8fa6f45ba
2016-04-21 11:35:37 -05:00
Richard Mudgett
9942d50aa5 bridge: Hold off more than one imparting channel at a time.
An earlier patch blocked the ast_bridge_impart() call until the channel
either entered the target bridge or it failed.  Unfortuantely, if the
target bridge is stasis and the imprted channel is not a stasis channel,
stasis bounces the channel out of the bridge to come back into the bridge
as a proper stasis channel.  When the channel is bounced out, that
released the block on ast_bridge_impart() to continue.  If the impart was
a result of a transfer, then it became a race to see if the swap channel
would get hung up before the imparted channel could come back into the
stasis bridge.  If the imparted channel won then everything is fine.  If
the swap channel gets hung up first then the transfer will fail because
the swap channel is leaving the bridge.

* Allow a chain of ast_bridge_impart()'s to happen before any are
unblocked to prevent the race condition described above.  When the channel
finally joins the bridge or completely fails to join the bridge then the
ast_bridge_impart() instances are unblocked.

ASTERISK-25947
Reported by: Richard Mudgett

ASTERISK-24649
Reported by: John Bigelow

ASTERISK-24782
Reported by: John Bigelow

Change-Id: I8fef369171f295f580024ab4971e95c799d0dde1
2016-04-20 15:45:38 -05:00
Richard Mudgett
f4693d1897 bridge_channel.c: Ignore role setup failure in channel push.
We have to setup the channel roles after the bridge class push is called
because the bridge class push callback may have set roles on the incoming
channel.  Since we have already partially pushed the channel into the
bridge and reversing what we have already done could be problematic, the
only thing we can do is press on to complete pushing the channel into the
bridge.

* Ignore any channel role setup errors after pushing the channel into a
bridge.  The channel may behave incorrectly in the bridge but we can no
longer abort the push at this time.

Change-Id: I08a97082b729052ee65cdca6bb730cf1289ede00
2016-04-18 10:51:56 -05:00
Joshua Colp
c7732a2600 Merge "Codecs: strip codec name while parsing allow/disallow options" into 13 2016-04-18 05:31:09 -05:00
Alexei Gradinari
64ecd41c8f Codecs: strip codec name while parsing allow/disallow options
Failed registration using PJSIP/Realtime if one of the codec name
in allow/disallow option is wrong or contains space.

This patch strip codec name.

ASTERISK-25914

Change-Id: Ifdf02de94e5ddbce305640f6f0666084a3b9283d
2016-04-11 17:25:08 -04:00
Jaco Kroon
3f6c4667b8 core_unreal: Fix hangupcauses not getting set on Local channels
ASTERISK-25912 #close

Change-Id: I8e72e6894feaf36c9450f2788d205d07baec23aa
2016-04-11 14:55:32 -05:00
zuul
736a2c303d Merge "lock: Add named lock capability" into 13 2016-04-11 12:56:40 -05:00
George Joseph
772ff3048f lock: Add named lock capability
Locking some objects like sorcery objects can be tricky because the underlying
ao2 object may not be the same for all callers.  For instance, two threads that
call ast_sorcery_retrieve_by_id on the same aor name might actually get 2
different ao2 objects if the underlying wizard had to rehydrate the aor from a
database. Locking one ao2 object doesn't have any effect on the other even if
those objects had locks in the first place.

Named locks allow access control by keyspace and key strings.  Now an "aor"
named "1000" can be locked and any other thread attempting to lock "aor" "1000"
will wait regardless of whether the underlying ao2 object is the same or not.
Mutex and rwlocks are supported.

This capability will initially be used to lock an aor when multiple threads may
be attempting to prune expired contacts from it.

Change-Id: If258c0b7f92b02d07243ce70e535821a1ea7fb45
2016-04-08 12:50:58 -06:00
zuul
e7e16b7465 Merge "pbx.c: Minor code rearangements." into 13 2016-04-08 11:18:33 -05:00
Richard Mudgett
82638fb0c7 pbx.c: Minor code rearangements.
* Pull out a loop invariant.

* Convert an else-if ladder to a switch statement.

Change-Id: I0a95cfa9474a4600b9865f7b444534d275b37e95
2016-04-07 17:12:49 -05:00
Richard Mudgett
2ef8a954b3 pbx: Update doxygen for extension state watchers.
Change-Id: Id1403b12136de62a272c01bb355aef65fd2c2d1e
2016-04-07 16:16:22 -05:00
Jacek Konieczny
0735a4d6d7 frame.c: Copy the whole subclass in ast_frdup().
The problem is ast_frdup() does not copy whole frame.subclass for voice,
video and image frames, only the format is copied.  For video frames, the
subclass structure contains the .frame_ending flag used to put the RTP
marker where it needs to be.

ASTERISK-25894 #close

Change-Id: I812ca90e84ed5d4f473b997d0dd0d3c5a915fe33
2016-04-06 11:12:48 -05:00
Joshua Colp
edf2ce2eff Merge "config: Allow filters when appending to a category" into 13 2016-04-05 15:29:14 -05:00
George Joseph
50b0922a22 config: Allow filters when appending to a category
In sorcery based config files where there are multiple categories with the same
name, you can't use the (+) operator to reliably append to a category because
config.c stops looking when it finds the first one with the same name.

Example:

[1000]
type = endpoint

[1000]
type = aor

[1000](+)
authenticate_qualify = yes

This config will fail because config.c appends authenticate_qualify to the
first category it finds, the endpoint, and that's not valid for endpoint.

Solution:

The capability to find a category that contains a certain variable already
exists so the only real change was to parse anything after the '+' that's not a
comma, as a filter string.

[1000]
type = endpoint

[1000]
type = aor

[1000](+type=aor)
authenticate_qualify = yes

This now works as expected.

Although the following example doesn't make any sense for pjsip, you can even
specify multiple filters:

[1000](+type=aor&qualify_frequency=10)

ASTERISK-25868 #close
Reported-by: Nick Repin

Change-Id: I10773da4c79db36fbf1993961992af63d3441580
2016-04-05 11:23:50 -05:00
Joshua Colp
2b84290386 Merge "stringfields: Refactor to allow fields to be added to the end of structures" into 13 2016-04-05 10:10:52 -05:00
Joshua Colp
bb7214180c Merge "res_rtp_asterisk: Use separate SRTP session for RTCP with DTLS" into 13 2016-04-05 05:37:09 -05:00
George Joseph
f6f4cf459f stringfields: Refactor to allow fields to be added to the end of structures
String fields are great, except that you can't add new ones without breaking
ABI compatibility because it shifts down everything else in the structure.
The only alternative is to add your own char * field to the end of the
structure and manage the memory yourself which isn't ideal, especially since
you then can't use the OPT_STRINGFIELD_T type.

Background:

The reason string fields had to be declared inside the
AST_DECLARE_STRING_FIELDS block was to facilitate iteration over all declared
fields for initialization, compare and copy.  Since AST_DECLARE_STRING_FIELDS
declared the pool, then the fields, then the manager, you could use the offsets
of the pool and manager and iterate over the sequential addresses in between to
access the fields. The actual pool, field allocation and field set operations
don't actually care where the field is.  It's just iteration over the fields
that was the problem.

Solution: Extended String Fields

An extended string field is one that is declared outside the
AST_DECLARE_STRING_FIELDS block but still (anywhere) inside the parent
structure.  Other than using AST_STRING_FIELD_EXTENDED instead of
AST_STRING_FIELD, it looks the same as other string fields.  It's storage comes
from the pool and it participates in string field compare and copy operations
peformed on the parent structure. It's also a valid target for the
OPT_STRINGFIELD_T aco option type.

Implementation:

To keep track of the extended fields and make sure that ABI isn't broken, the
existing embedded_pool pointer in the manager structure was repurposed to be a
pointer to a separate header structure that contains the embedded_pool pointer
plus a vector of fields.  The length of the manager structure didn't change and
the embedded_pool pointer isn't used in the macros, only the stringfields C
code.  A side benefit of this is that changing the header structure in the
future won't break ABI.

ast_string_fields_init initializes the normal string fields and appends them to
the vector, and subsequent calls to ast_string_field_init_extended initialize
and append the extended fields. Cleanup, ast_string_fields_cmp, and
ast_string_fields_copy can now work on the vector instead of sequentially
traversing the addresses between the pool and manager.

The total size of a structure using string fields didn't change, whether using
extended fields or not, nor have the offsets of any structure members, either
inside the original block or outside.  Adding an extended field to the end of a
structure is the same as adding a char *.

Details:

The stringfield C code was pulled out from utils.c and into stringfields.c.
It just made sense.

Additional work was done in ast_string_field_init and
ast_calloc_with_stringfields to handle the allocation of the new header
structure and the vector, and the associated cleanup.  In the process some
additional NULL pointer checking was added.

A lot of work was done in stringfields.h since the logic for compare and copy
is there.  Documentation was added as well as somne additional NULL checking.

The ability to call ast_calloc_with_stringfields with a number of structures
greater than 1 never really worked.  Well, the calloc worked but there was no
way to access the additional structures or clean them up.  It was agreed that
there was no use case for requesting more than 1 structure so an ast_assert
was added to prevent it and the iteration code removed.

Testing:

The stringfield unit tests were updated to test both normal and extended
fields.  Tests for ast_string_field_ptr_set_by_fields and
ast_calloc_with_stringfields were also added.

As an ABI test, 13 was compiled from git and the res_pjsip_* modules, except
res_pjsip itself, saved off.  The patch was then added and a full compile and
install was performed.  Then the older res_pjsip_* moduled were copied over the
installed versions so res_pjsip was new and the rest were old.  No issues.

contact->aor, which is a char * at the end of contact, was then changed to an
extended string field and a recompile and reinstall was performed, again
leaving stock versions of the the res_pjsip_* modules.  Again, no issues with
the res_pjsip_* modules using the old stringfield implementation and with
contact->aor as a char *, and res_pjsip itself using the new stringfield
implementation and contact->aor being an extended string field.

Finally, several existing string fields were converted to extended string
fields to test OPT_STRINGFIELD_T.  Again, no issues.

Change-Id: I235db338c5b178f5a13b7946afbaa5d4a0f91d61
2016-04-04 18:07:18 -06:00
George Joseph
566601837e utils.c: Fix typo in handle_show_locks
ast_cli_allow_on_shutdown(e) should have been ast_cli_allow_at_shutdown(e).

Change-Id: I4f092495c0b2bfd85c2651e0b5877bf4d05d9faf
2016-04-01 12:09:50 -06:00
Joshua Colp
b602886c6b Merge "pjproject_bundled: Fix use of LDCONFIG for shared library link creation" into 13 2016-03-31 12:35:49 -05:00
George Joseph
964f54bd5d pjproject_bundled: Fix use of LDCONFIG for shared library link creation
LDCONFIG apparently isn't set to something sane on all systems so the creation
of the shared library links fails.  Instead of just testing for non-blank,
main/Makefile now checks that LDCONFIG is actually executable and reverts to
LN if it isn't.

This applies to both libasteriskpj and libasteriskssl.

Thanks to 'abelbeck' for pointing out that the issue was LDCONFIG.

ASTERISK-25873 #close
Reported-by: Hans van Eijsden

Change-Id: I25b76379bc637726ec044b2c0e709b56b3701729
2016-03-30 17:43:56 -06:00
Richard Mudgett
7f53f1d89e core_unreal.c: Add clarification comment about channel ref.
Change-Id: I0be0627260cd8d6b6c3cc345949dcfdf32eff1f3
2016-03-30 16:26:36 -05:00
Joshua Colp
7b7e3909e4 Merge "sorcery/res_pjsip: Refactor for realtime performance" into 13 2016-03-29 13:16:17 -05:00
Jacek Konieczny
0cfab30b28 res_rtp_asterisk: Use separate SRTP session for RTCP with DTLS
Asterisk uses separate UDP ports for RTP and RTCP traffic and RFC 5764
explicitly states:

  There MUST be a separate DTLS-SRTP session for each distinct pair of
  source and destination ports used by a media session

This means RTP keying material cannot be used for DTLS RTCP, which was
the reason why RTCP encryption would fail.

ASTERISK-25642

Change-Id: I7e8779d8b63e371088081bb113131361b2847e3a
2016-03-29 09:29:45 -05:00
Richard Mudgett
50f90d4099 res_parking: Fix blind transfer dynamic lots creation.
Blind transfers to a recognized parking extension need to use the parker's
channel variable values to create the dynamic parking lot.  This is
because there is always only one parker while the parkee may actually be a
multi-party bridge.  A multi-party bridge can never supply the needed
channel variables to create the dynamic parking lot.  In the multi-party
bridge blind transfer scenario, the parker's CHANNEL(parkinglot) value and
channel variables are inherited by the local channel used to park the
bridge.

* In park_common_setup(), make use the parker instead of the parkee to
supply the dynamic parking lot channel variable values.  In all but one
case, the parkee is the same as the parker.  However, in the recognized
parking extension blind transfer scenario for a two party bridge they are
different channels.  For consistency, we need to use the parker channel.

* In park_local_transfer(), pass the CHANNEL(parkinglot) value to the
local channel when blind transferring a multi-party bridge to a recognized
parking extension.

* When a local channel starts a call, the Local;2 side needs to inherit
the CHANNEL(parkinglot) value from Local;1.

The DTMF one-touch parking case wasn't even trying to create dynamic
parking lots before it aborted the attempt.

* In parking_park_call(), add missing code to create a dynamic parking
lot.

A DTMF bridge hook is documented as returning -1 to remove the hook.
Though the hook caller is really coded to accept non-zero.  See the
ast_bridge_hook_callback typedef.

* In feature_park_call(), don't remove the DTMF one-touch parking hook
because of an error.

ASTERISK-24605 #close
Reported by:  Philip Correia
Patches:
      call_park.patch (license #6672) patch uploaded by Philip Correia

Change-Id: I221d3a8fcc181877a1158d17004474d35d8016c9
2016-03-26 02:43:48 -05:00
George Joseph
5aa5c49413 sorcery/res_pjsip: Refactor for realtime performance
There were a number of places in the res_pjsip stack that were getting
all endpoints or all aors, and then filtering them locally.

A good example is pjsip_options which, on startup, retrieves all
endpoints, then the aors for those endpoints, then tests the aors to see
if the qualify_frequency is > 0.  One issue was that it never did
anything with the endpoints other than retrieve the aors so we probably
could have skipped a step and just retrieved all aors. But nevermind.

This worked reasonably well with local config files but with a realtime
backend and thousands of objects, this was a nightmare.  The issue
really boiled down to the fact that while realtime supports predicates
that are passed to the database engine, the non-realtime sorcery
backends didn't.

They do now.

The realtime engines have a scheme for doing simple comparisons. They
take in an ast_variable (or list) for matching, and the name of each
variable can contain an operator.  For instance, a name of
"qualify_frequency >" and a value of "0" would create a SQL predicate
that looks like "where qualify_frequency > '0'".  If there's no operator
after the name, the engines add an '=' so a simple name of
"qualify_frequency" and a value of "10" would return exact matches.

The non-realtime backends decide whether to include an object in a
result set by calling ast_sorcery_changeset_create on every object in
the internal container.  However, ast_sorcery_changeset_create only does
exact string matches though so a name of "qualify_frequency >" and a
value of "0" returns nothing because the literal "qualify_frequency >"
doesn't match any name in the objset set.

So, the real task was to create a generic string matcher that can take a
left value, operator and a right value and perform the match. To that
end, strings.c has a new ast_strings_match(left, operator, right)
function.  Left and right are the strings to operate on and the operator
can be a string containing any of the following: = (or NULL or ""), !=,
>, >=, <, <=, like or regex.  If the operator is like or regex, the
right string should be a %-pattern or a regex expression.  If both left
and right can be converted to float, then a numeric comparison is
performed, otherwise a string comparison is performed.

To use this new function on ast_variables, 2 new functions were added to
config.c.  One that compares 2 ast_variables, and one that compares 2
ast_variable lists.  The former is useful when you want to compare 2
ast_variables that happen to be in a list but don't want to traverse the
list.  The latter will traverse the right list and return true if all
the variables in it match the left list.

Now, the backends' fields_cmp functions call ast_variable_lists_match
instead of ast_sorcery_changeset_create and they can now process the
same syntax as the realtime engines.  The realtime backend just passes
the variable list unaltered to the engine.  The only gotcha is that
there's no common realtime engine support for regex so that's been noted
in the api docs for ast_sorcery_retrieve_by_fields.

Only one more change to sorcery was done...  A new config flag
"allow_unqualified_fetch" was added to reg_sorcery_realtime.
"no": ignore fetches if no predicate fields were supplied.
"error": same as no but emit an error. (good for testing)
"yes": allow (the default);
"warn": allow but emit a warning. (good for testing)

Now on to res_pjsip...

pjsip_options was modified to retrieve aors with qualify_frequency > 0
rather than all endpoints then all aors.  Not only was this a big
improvement in realtime retrieval but even for config files there's an
improvement because we're not going through endpoints anymore.

res_pjsip_mwi was modified to retieve only endpoints with something in
the mailboxes field instead of all endpoints then testing mailboxes.

res_pjsip_registrar_expire was completely refactored.  It was retrieving
all contacts then setting up scheduler entries to check for expiration.
Now, it's a single thread (like keepalive) that periodically retrieves
only contacts whose expiration time is < now and deletes them.  A new
contact_expiration_check_interval was added to global with a default of
30 seconds.

Ross Beer reports that with this patch, his Asterisk startup time dropped
from around an hour to under 30 seconds.

There are still objects that can't be filtered at the database like
identifies, transports, and registrations.  These are not going to be
anywhere near as numerous as endpoints, aors, auths, contacts however.

Back to allow_unqualified_fetch.  If this is set to yes and you have a
very large number of objects in the database, the pjsip CLI commands
will attempt to retrive ALL of them if not qualified with a LIKE.
Worse, if you type "pjsip show endpoint <tab>" guess what's going to
happen? :)  Having a cache helps but all the objects will have to be
retrieved at least once to fill the cache.  Setting
allow_unqualified_fetch=no prevents the mass retrieve and should be used
on endpoints, auths, aors, and contacts.  It should NOT be used for
identifies, registrations and transports since these MUST be
retrieved in bulk.

Example sorcery.conf:

[res_pjsip]
endpoint=config,pjsip.conf,criteria=type=endpoint
endpoint=realtime,ps_endpoints,allow_unqualified_fetch=error

ASTERISK-25826 #close
Reported-by: Ross Beer
Tested-by: Ross Beer

Change-Id: Id2691e447db90892890036e663aaf907b2dc1c67
2016-03-25 19:19:39 -06:00
zuul
096e7a88ce Merge "core/logging: Fix broken syslog levels on older glibc." into 13 2016-03-25 13:38:37 -05:00
zuul
8271a06dde Merge "Restrict CLI/AMI commands on shutdown." into 13 2016-03-24 19:55:50 -05:00
Gianluca Merlo
c6e4c48e67 config: fix flags in uint option handler
The configuration unsigned integer option handler sets flags for the
parser as if the option should be a signed integer (PARSE_INT32),
leading to errors on "out of range" values. Fix flags (PARSE_UINT32).

A fix to res_pjsip is also present which stops invalid flags from
being passed when registering sorcery object fields for qualify
status.

ASTERISK-25612 #close

Change-Id: I96b539336275e0e72a8e8033487d2c3344debd3e
2016-03-24 13:14:33 -03:00
Mark Michelson
59c8e189fd Restrict CLI/AMI commands on shutdown.
During stress testing, we have frequently seen crashes occur because a
CLI or AMI command attempts to access information that is in the process
of being destroyed.

When addressing how to fix this issue, we initially considered fixing
individual crashes we observed. However, the changes required to fix
those problems would introduce considerable overhead to the nominal
case. This is not reasonable in order to prevent a crash from occurring
while Asterisk is already shutting down.

Instead, this change makes it so AMI and CLI commands cannot be executed
if Asterisk is being shut down. For AMI, this is absolute. For CLI,
though, certain commands can be registered so that they may be run
during Asterisk shutdown.

ASTERISK-25825 #close

Change-Id: I8887e215ac352fadf7f4c1e082da9089b1421990
2016-03-24 09:29:07 -05:00
Walter Doekes
82e55e4883 core/logging: Fix broken syslog levels on older glibc.
The fix to ASTERISK-25407 introduced the usage of LOG_MAKEPRI. However
this macro is broken in older glibc (< 2.17); it would left-shift the
facility a second time, causing the resultant priority to become
invalid.

The syslog manpage mentions nothing about LOG_MAKEPRI and suggests this:

    The priority argument is formed by ORing the facility and the level
    values [...].

ASTERISK-25510 #close
Reported by: Michael Newton

Change-Id: Ia89debe7fac5ad090c7ef595c0707f31bb1e3d03
2016-03-24 06:34:20 -05:00
Richard Mudgett
b2d2906445 sched.c: Ensure oldest expiring entry runs first.
This patch is part of a series to resolve deadlocks in chan_sip.c.

* Updated sched unit test to check new behavior.

ASTERISK-25023

Change-Id: Ib69437327b3cda5e14c4238d9ff91b2531b34ef3
2016-03-15 12:01:50 -05:00
Walter Doekes
336cae73cc app_chanspy: Fix occasional deadlock with ChanSpy and Local channels.
Channel masquerading had a conflict with autochannel locking.

When locking autochannel->channel, the channel is fetched from the
autochannel and then locked. During the fetch, the autochannel -- which
has no locks itself -- can be modified by someone who owns the channel
lock. That means that the value of autochan->channel cannot be trusted
until you hold the lock.

In practice, this caused problems with Local channels getting
masqueraded away while the ChanSpy attempted to get info from that
channel. The old channel which was about to get removed got locked, but
the new (replaced) channel got unlocked (no-op). Because the replaced
channel was now locked (and would never get unlocked), it couldn't get
removed from the channel list in a timely manner, and would now cause
deadlocks when iterating over the channel list.

This change checks the autochannel after locking the channel for changes
to the autochannel. If the channel had been changed, the lock is
reobtained on the new channel.

In theory it seems possible that after this fix, the lock attempt on the
old (wrong) channel can be on an already destroyed lock, maybe causing
a crash. But that hasn't been observed in the wild and is harder induce
than the current deadlock.

Thanks go to Filip Frank for suggesting a fix similar to this and
especially to IRC user hexanol for pointing out why this deadlock was
possible and testing this fix. And to Richard for catching my rookie
while loop mistake ;)

ASTERISK-25321 #close

Change-Id: I293ae0014e531cd0e675c3f02d1d118a98683def
2016-03-11 23:03:08 +01:00
zuul
9b0a96b947 Merge "loader: Retry dlopen when loading fails" into 13 2016-03-03 19:57:41 -06:00
George Joseph
53f57001f2 loader: Retry dlopen when loading fails
Although we use the RTLD_LAZY flag when calling dlopen
the first time on a module, this only defers resolution
for function calls.  Pointer references to functions are
determined at link time so dlopen expects them to be there.
Since we don't cross-module link, pointers to functions
in other modules won't be available and dlopen will fail.

Doing a "hardened" build also causes problems because it
typically sets "-z now" on the ld command line which
overrides RTLD_LAZY at run time.

If the failing module isn't a GLOBAL_SYMBOLS module, then
dlopen will be called again after all the GLOBAL_SYMBOLS
modules have been loaded and they'll eventually resolve.

If the calling module IS a GLOBAL_SYMBOLS module itself
and a third module depends on it, then there's an issue
because the second time through the dlopen loop,
GLOBAL_SYMBOLS modules aren't given any special treatment
and since the order in which dlopen is called isn't
deterministic, the dependent may again be tried before the
module it needs is loaded.

Simple solution:  Save modules that fail load_resource
because of a dlopen error in a list and retry them
immediately after the first pass. Keep retrying until
the failed list is empty or we reach a #defined max
retries. Error messages are suppressed until the final
pass which also gets rid of those confusing error messages
about module failures that are later corrected.

Change-Id: Iddae1d97cd2f00b94e61662447432765755f64bb
2016-03-03 15:37:48 -06:00
Kevin Harwell
40d9e9e238 bridge.c: Crash during attended transfer when missing a local channel half
It's possible for the transferer channel to get hung up early during the
attended transfer process. For instance, a phone may send a "bye" immediately
upon receiving a sip notify that contains a sip frag 100 (I'm looking at you
Jitsi). When this occurs a race begins between the transferer being hung up
and completion of the transfer code.

If the channel hangs up too early during a transfer involving stasis bridging
for instance, then when the created local channel goes to look up its swap
channel (and associated datastore) it can't find it (since it is no longer in
the bridge) thus it fails to enter the stasis application. Consequently, the
created local channel(s) hang up as well. If the timing is just right then the
bridging code attempts to add the message link with missing local channel(s).
Hence the crash.

Unfortunately, there is no great way to solve the problem of the unexpected
"bye". While we can't guarantee we won't receive an early hangup, and in this
case still fail to enter the stasis application, we can make it so asterisk
does not crash.

This patch does just that by locking the local channel structure, checking
that the local channel's peer has not been lost, and then continuing. This
keeps the local channel's peer from being ripped out from underneath it by
the local/unreal hangup code while attempting to set the stasis message link.

ASTERISK-25771

Change-Id: Ie6d6061e34c7c95f07116fffac9a09e5d225c880
2016-03-03 13:55:24 -06:00
zuul
9e896540c8 Merge "build-system: Allow building with static pjproject" into 13 2016-03-03 11:16:48 -06:00
Joshua Colp
86124f63c8 Merge "SIP diversion: Fix REDIRECTING(reason) value inconsistencies." into 13 2016-03-03 08:47:36 -06:00
Scott Griepentrog
1ea7a5a774 CHAOS: cleanup possible null vars on msg alloc failure
In message.c, if msg_alloc fails to init the string field,
vars may be null, so use a null tolerant cleanup.

In res_pjsip_messaging.c, if msg_data_create fails, mdata
will be null, so use a null tolerant cleanup.

ASTERISK-25323

Change-Id: Ic2d55c2c3750d5616e2a05ea92a19c717507ff56
2016-03-02 12:00:37 -06:00
Richard Mudgett
4165ea7778 SIP diversion: Fix REDIRECTING(reason) value inconsistencies.
Previous chan_sip behavior:

Before this patch chan_sip would always strip any quotes from an incoming
reason and pass that value up as the REDIRECTING(reason).  For an outgoing
reason value, chan_sip would check the value against known values and
quote any it didn't recognize.  Incoming 480 response message reason text
was just assigned to the REDIRECTING(reason).

Previous chan_pjsip behavior:

Before this patch chan_pjsip would always pass the incoming reason value
up as the REDIRECTING(reason).  For an outgoing reason value, chan_pjsip
would send the reason value as passed down.

With this patch:

Both channel drivers match incoming reason values with values documented
by REDIRECTING(reason) and values documented by RFC5806 regardless of
whether they are quoted or not.  RFC5806 values are mapped to the
equivalent REDIRECTING(reason) documented value and is set in
REDIRECTING(reason).  e.g., an incoming RFC5806 'unconditional' value or a
quoted string version ('"unconditional"') is converted to
REDIRECTING(reason)'s 'cfu' value.  The user's dialplan only needs to deal
with 'cfu' instead of any of the aliases.

The incoming 480 response reason text supported by chan_sip checks for
known reason values and if not matched then puts quotes around the reason
string and assigns that to REDIRECTING(reason).

Both channel drivers send outgoing known REDIRECTING(reason) values as the
unquoted RFC5806 equivalent.  User custom values are either sent as is or
with added quotes if SIP doesn't allow a character within the value as
part of a RFC3261 Section 25.1 token.  Note that there are still
limitations on what characters can be put in a custom user value.  e.g.,
embedding quotes in the middle of the reason string is silly and just
going to cause you grief.

* Setting a REDIRECTING(reason) value now recognizes RFC5806 aliases.
e.g., Setting REDIRECTING(reason) to 'unconditional' is converted to the
'cfu' value.

* Added missing malloc() NULL return check in res_pjsip_diversion.c
set_redirecting_reason().

* Fixed potential read from a stale pointer in res_pjsip_diversion.c
add_diversion_header().  The reason string needed to be copied into the
tdata memory pool to ensure that the string would always be available.
Otherwise, if the reason string returned by reason_code_to_str() was a
user's reason string then the string could be freed later by another
thread.

Change-Id: Ifba83d23a195a9f64d55b9c681d2e62476b68a87
2016-03-01 20:13:39 -06:00
George Joseph
b59956a875 build-system: Allow building with static pjproject
Background here:
http://lists.digium.com/pipermail/asterisk-dev/2016-January/075266.html

From CHANGES:
 * To help insure that Asterisk is compiled and run with the same known
   version of pjproject, a new option (--with-pjproject-bundled) has been
   added to ./configure.  When specified, the version of pjproject specified
   in third-party/versions.mak will be downloaded and configured.  When you
   make Asterisk, the build process will also automatically build pjproject
   and Asterisk will be statically linked to it.  Once a particular version
   of pjproject is configured and built, it won't be configured or built
   again unless you run a 'make distclean'.

   To facilitate testing, when 'make install' is run, the pjsua and pjsystest
   utilities and the pjproject python bindings will be installed in
   ASTDATADIR/third-party/pjproject.

   The default behavior remains building with the shared pjproject
   installation, if any.

Building:

   All you have to do is include the --with-pjproject-bundled option on
   the ./configure command line (and remove any existing --with-pjproject
   option if specified).  Everything else is automatic.

Behind the scenes:

   The top-level Makefile was modified to include 'third-party' in the
   list of MOD_SUBDIRS.

   The third-party directory was created to contain any third party
   packages that may be needed in the future.  Its Makefile automatically
   iterates over any subdirectories passing on targets.

   The third-party/pjproject directory was created to house the pjproject
   source distribution.  Its Makefile contains targets to download, patch
   configure, generate dependencies, compile libs, apps and python bindings,
   sanitized build.mak and generate a symbols list.

   When bootstrap.sh is run, it automatically includes the configure.m4
   file in third-party/pjproject.  This file has a macro to download and
   conifgure pjproject and get and set PJPROJECT_INCLUDE, PJPROJECT_DIR
   and PJPROJECT_BUNDLED.  It also tests for the capabilities like
   PJ_TRANSACTION_GRP_LOCK by parsing preprocessor output as opposed to
   trying to compile.  Of course, bootstrap.sh is only run once and the
   configure file is incldued in the patch.

   When configure is run with the new options, the macro in configure.m4
   triggers the download, patch, conifgure and tests.  No compilation is
   performed at this time.  The downloaded tarball is cached in /tmp so
   it doesn't get downloaded again on a distclean.

   When make is run in the top-level Asterisk source directory, it will
   automatically descend all the subdirectories in third_party just as it
   does for addons, apps, etc.  The top-level Makefile makes sure that
   the 'third-party' is built before 'main' so that dependencies from the
   other directories are built first.

   When main does build, a new shared library (libasteriskpj) is created that
   links statically to the pjproject .a files and exports all their symbols.
   The asterisk binary links to that, just as it does with libasteriskssl.

   When Asterisk is installed, the pjsua and pjsystest apps, and the pjproject
   python bindings are installed in ASTDATADIR/third-party/pjproject.  This
   will facilitate testing, including running the testsuite which will be
   updated to check that directory for the pjsua module ahead of the system
   python library.

Modules should continue to depend on pjproject if they use pjproject APIs
directly.  They should not care about the implementation.  No changes to any
res_pjsip modules were made.

Change-Id: Ia7a60c28c2e9ba9537c5570f933c1ebcb20a3103
2016-03-01 09:33:17 -07:00
zuul
5393a4963a Merge "bridge core: Add owed T.38 terminate when channel leaves a bridge." into 13 2016-02-29 18:09:53 -06:00
zuul
dbf52dd7d7 Merge "channel api: Create is_t38_active accessor functions." into 13 2016-02-29 18:00:19 -06:00